Re: [MERGE] Rework urlAbsolute to be a little more streamlined.

2008-09-01 Thread Alex Rousskov
On Mon, 2008-09-01 at 12:42 +1000, Benno Rice wrote:
 Resubmit this patch, including changes based on comments by various people.
 
 - Mention RFC text in relation to changing the default behaviour in relation
   to unknown HTTP methods.
 - Use safe_free instead of xfree.
 - Rework urlAbsolute to use snprintf in a slightly better way.  Snprintf is 
 now
   used to construct the initial portion of the url and the rest is added on
   using POSIX string routines.


 +const char *url, *absUrl;
 +
 +if ((url = rep-header.getStr(hdr)) != NULL) {

The above is better written as

if (const char *url = rep-header.getStr(hdr)) {

It would also be better to name this URL hdrUrl. You already have
reqUrl and absUrl and it is difficult to keep track.

 +const char *url, *absUrl;
...
 +if (absUrl != NULL) {
 +   url = absUrl;
 +}
 +if (absUrl != NULL) { // if the URL was relative, it is by nature the 
 same host
 +   purgeEntriesByUrl(url);
 +} else if (sameUrlHosts(reqUrl, url)) { // prevent purging DoS, per RFC 
 2616 13.10, second last paragraph
 +   purgeEntriesByUrl(url);
 +}
 +if (absUrl != NULL) {
 +   safe_free(absUrl);
 +}

The above looks strange but since urlAbsolute is still not documented, I
may be misinterpreting this code. I would expect something like the code
below, which uses cleanup names:

  if (const char *absUrl = urlRelativeToAbsolute(req, hdrUrl)) {
// hdrUrl was relative so absUrl points to the request host
purgeEntriesByUrl(absUrl);
safe_free(absUrl);
  } else
  if (sameUrlHosts(reqUrl, url)) { // prevent purging DoS; RFC 2616 13.10
purgeEntriesByUrl(url);
  }

If you want to make it even more clear, split urlAbsolute in two:
urlIsRelative and urlMakeAbsolute, arriving at something like 

  if (isRelative(hdrUrl)) {
const char *absUrl = urlMakeAbsolute(req, hdrUrl);
// hdrUrl was relative so absUrl points to the request host
purgeEntriesByUrl(absUrl);
safe_free(absUrl);
  } else
  if (sameUrlHosts(reqUrl, url)) { // prevent purging DoS: RFC 2616 13.10
purgeEntriesByUrl(url);
  }

The above are polishing comments.

I would still insist that urlAbsolute() is documented as its
complicated, multi-purpose interface is impossible to guess by its name.
I think it returns NULL when the URL we feed is already absolute, but I
am not sure. If my understanding if the intent is correct, then
splitting urlAbsolute in two functions as shown above would be
preferred.


 +if (strchr(relUrl, ':') != NULL) {
 +return (NULL);
 +}

According to RFC 2396, Section 3.3 Path Component, a URL path may
include a colon (':') character. This would make the above check wrong.
Did I misread the RFC? Are colons banned from URL paths?


I found this call in urlCanonical():

 if (request-protocol == PROTO_URN) {
 snprintf(urlbuf, MAX_URL, urn:%s, request-urlpath.buf());
 } else {
 /// \todo AYJ: this could use if..else and method == METHOD_CONNECT easier.
 switch (request-method.id()) {
 
 case METHOD_CONNECT:
 snprintf(urlbuf, MAX_URL, %s:%d, request-GetHost(), 
 request-port);
 break;
 
 default:
 portbuf[0] = '\0';
 
 if (request-port != urlDefaultPort(request-protocol))
 snprintf(portbuf, 32, :%d, request-port);
 
 snprintf(urlbuf, MAX_URL, %s://%s%s%s%s%s,
  ProtocolStr[request-protocol],
  request-login,
  *request-login ? @ : null_string,
  request-GetHost(),
  portbuf,
  request-urlpath.buf());
 

this code in urlCanonicalClean():

 if (request-protocol == PROTO_URN) {
 snprintf(buf, MAX_URL, urn:%s, request-urlpath.buf());
 } else {
 /// \todo AYJ: this could use if..else and method == METHOD_CONNECT easier.
 switch (request-method.id()) {
 
 case METHOD_CONNECT:
 snprintf(buf, MAX_URL, %s:%d, 
  request-GetHost(),
  request-port);
 break;
 
 default:
 portbuf[0] = '\0';
 
 if (request-port != urlDefaultPort(request-protocol))
 snprintf(portbuf, 32, :%d, request-port);
 
 loginbuf[0] = '\0';
 
 if ((int) strlen(request-login)  0) {
 strcpy(loginbuf, request-login);
 
 if ((t = strchr(loginbuf, ':')))
 *t = '\0';
 
 strcat(loginbuf, @);
 }
 
 snprintf(buf, MAX_URL, %s://%s%s%s%s,
  ProtocolStr[request-protocol],
  loginbuf,
  request-GetHost(),
  portbuf,
  request-urlpath.buf());

and now there is this code in urlAbsolute (which handles the CONNECT
case earlier so the code below is guaranteed 

Re: pseudo-specs for a String class

2008-09-01 Thread Alex Rousskov
On Sun, 2008-08-31 at 14:07 +0200, Kinkie wrote:

 I've gotten a bit forward, now I'm a bit at a loss about where to go next.

I would move string, stream, and other formatting functions away from
this class. 

I would move searching and comparison functions away from this class.

I would probably remove default conversion from char*. It buys you very
little in Squid low-level Buffer context, but can introduce serious
problems (we have seen some of that during the previous BetterString
attempts).

 char *exportCopy(void);
 char *exportRefIKnowWhatImDoing(void);

What do these return and how do they differ?


 KBuf operator = (char const *S, u_int32_t Ssize);

Does C++ have a tertiary assignment operator?


 KBuf consume(u_int32_t howmuch); //from MemBuf

Since you are not just consuming but creating a new buffer, I would call
this split(), move(), or extract().


 const int operator [] (int pos);

Useless without size() and probably does not belong to a low-level buffer.


There are a few places with methods arguments are references that are
missing const. Please review.


 Current interface:
 
 class KBuf {
 KBuf();
 KBuf(const KBuf S);
 KBuf(const char *S, u_int32_t Ssize);
 KBuf(const char *S); //null-terminated
 ~KBuf();
 bool isNull();
 KBuf operator = (KBuf S);
 KBuf operator = (char const *S, u_int32_t Ssize);
 KBuf operator = (char const *S);  //null-terminated
 KBuf append(KBuf S);
 KBuf append(const char * S, u_int32_t Slen);
 KBuf append(const char * S); //null-terminated
 KBuf append(const char c);   //To be removed?
 KBuf appendf(const char *fmt, ...); //to be copied over from membuf
 std::ostream  print (std::ostream os); // for operator
 void dump(ostream os); //dump debugging info
 const int operator [] (int pos);
 int cmp(KBuf S); //strcmp()
 bool operator == (const KBuf  S);
 bool operator (KBuf S);
 bool operator (KBuf S);
 void truncate(u_int32_t to_size);
 KBuf consume(u_int32_t howmuch); //from MemBuf
 void terminate(void); //null-terminate
 static ostream  stats(ostream os);
 char *exportCopy(void);
 char *exportRefIKnowWhatImDoing(void);
 KBuf nextToken(const char *delim); //strtok()
 KBuf substr(u_int32_t from, u_int32_t to);
 u_int32_t index(char c);
 u_int32_t rindex(char c);
 }
 
 on x86, sizeof(KBuf)=16
 
 Now I'm a bit at a loss as to how to best integrate with iostream.
 There's basically three possibilities:
 
 1. KBuf kb; kb.append(stringstream )
cheapest implementation, but each of those requires two to three
 copies of the contents

I do not think your buffer should know about stringstream.

 2. Kbuf kb; stringstream ss=kb-stream(); ss (blah).
this seems to be the official way of extending iostreams;
 performed by making KBuf a subclass of stringbuf.
extends sizeof(KBuf) by 32 bytes, and many of its calls need to
 housekeep two states.

Do not do that, at least not for now. Your buffer is not a stream and
probably is not a stream buffer either.

 3. Kbuf kb; stringstream ss=kb-stream(); ss  (blah)
 performed by using an adapter class. The coding effort starts to
 be quite noticeable, as keeping the stringbuf and the KBuf in sync is
 not trivial.

Same as #2.

 4 Kbuf kb(blah). requires kbuf to be a subclass of an ostream.
there's a significant coding effort, AND baloons the size of KBuf
 to 156 bytes.

#4 does not really require your class to be a child of ostream, but I do
not think we need this either.


 What's your take on how to better address this?

IMO, you should not address formatted output in your buffer.

If you insist on placing low-level buffering functions and high-level
string functions together, then you can add a few  operators that
would append a few commonly used types to the buffer. Adding those
operators will not increase the memory footprint of your class.

HTH,

Alex.




Re: pseudo-specs for a String class

2008-09-01 Thread Alex Rousskov
On Sun, 2008-08-31 at 22:40 +0200, Kinkie wrote:
 On Sun, Aug 31, 2008 at 6:00 PM, Adrian Chadd [EMAIL PROTECTED] wrote:
  Do you really want to provide a 'consume' interface for a low-level
  representation of memory?
 
  I think trying to replace MemBuf with this new buffer is a bit silly.
  Sure, use it -in- MemBuf, along with all the other places that buffers
  are used.
 
  What about strtok()? Why would you want to tokenise data?
 
 I'd like it to be used
 - everywhere MemBuf is used
 - everywhere a char* is used for a string
 - strtok, strchr, strstr, etc to be used in the HTTP parser
 
 All of the above with increased efficiency both in terms callers' LOC
 and performance.

 Given:
 KBuf src=GET http://some.url/ HTTP/1.0\r\nHost: some.url\r\n\r\n
 KBuf out;
 
 The following are expected to be equivalent:
 
 u_int32_t pos=src.index('\n');
 out=src.substr(0,pos);
 src=src.substr(pos,-1); //to end-of-string
 
 out=src.consume(src.index('\n'));
 
 out=src.nextToken('\n');
 

IMO, manipulations like the above should be done in a String class that
uses a low-level Buffer. This would yield same callers LOC (not that I
care much about that) and the same performance as your solution, but
would allow us to separate the two rather different concepts and
optimize/extend each separately.

Thank you,

Alex.





Re: [MERGE] Rework urlAbsolute to be a little more streamlined.

2008-09-01 Thread Adrian Chadd
one zeroes, one doesn't.
calloc is meant to be x objects of size y, but its effectively also a bzero().


Adrian

2008/9/2 Amos Jeffries [EMAIL PROTECTED]:

 On 01/09/2008, at 1:01 PM, Amos Jeffries wrote:

 Resubmit this patch, including changes based on comments by various
 people.

 - Mention RFC text in relation to changing the default behaviour in
 relation
  to unknown HTTP methods.
 - Use safe_free instead of xfree.
 - Rework urlAbsolute to use snprintf in a slightly better way.
 Snprintf
 is now
  used to construct the initial portion of the url and the rest is
 added
 on
  using POSIX string routines.


 I'm sure you can still crop the xstrdup() usage by adding a JIT
 allocation
 of urlbuf before if (req-protocol == PROTO_URN)
 and returning urlbuf plain at the end.

 As in malloc it?

 Yes with xmalloc or xcalloc. (I'm still not sure why we have two).

 Amos





Re: [MERGE] Rework urlAbsolute to be a little more streamlined.

2008-09-01 Thread Amos Jeffries
 On Mon, 2008-09-01 at 12:42 +1000, Benno Rice wrote:
 Resubmit this patch, including changes based on comments by various
 people.

 - Mention RFC text in relation to changing the default behaviour in
 relation
   to unknown HTTP methods.
 - Use safe_free instead of xfree.
 - Rework urlAbsolute to use snprintf in a slightly better way.  Snprintf
 is now
   used to construct the initial portion of the url and the rest is added
 on
   using POSIX string routines.


 +const char *url, *absUrl;
 +
 +if ((url = rep-header.getStr(hdr)) != NULL) {

 The above is better written as

 if (const char *url = rep-header.getStr(hdr)) {

 It would also be better to name this URL hdrUrl. You already have
 reqUrl and absUrl and it is difficult to keep track.

 +const char *url, *absUrl;
 ...
 +if (absUrl != NULL) {
 +   url = absUrl;
 +}
 +if (absUrl != NULL) { // if the URL was relative, it is by nature
 the same host
 +   purgeEntriesByUrl(url);
 +} else if (sameUrlHosts(reqUrl, url)) { // prevent purging DoS, per
 RFC 2616 13.10, second last paragraph
 +   purgeEntriesByUrl(url);
 +}
 +if (absUrl != NULL) {
 +   safe_free(absUrl);
 +}

 The above looks strange but since urlAbsolute is still not documented, I
 may be misinterpreting this code. I would expect something like the code
 below, which uses cleanup names:

   if (const char *absUrl = urlRelativeToAbsolute(req, hdrUrl)) {
   // hdrUrl was relative so absUrl points to the request host
   purgeEntriesByUrl(absUrl);
   safe_free(absUrl);
   } else
   if (sameUrlHosts(reqUrl, url)) { // prevent purging DoS; RFC 2616 13.10
   purgeEntriesByUrl(url);
   }

Thanks for picking that up Alex.

Benno the safe_free() does the NULL case handling itself, sorry for not
being clear.


 If you want to make it even more clear, split urlAbsolute in two:
 urlIsRelative and urlMakeAbsolute, arriving at something like

   if (isRelative(hdrUrl)) {
   const char *absUrl = urlMakeAbsolute(req, hdrUrl);
   // hdrUrl was relative so absUrl points to the request host
   purgeEntriesByUrl(absUrl);
   safe_free(absUrl);
   } else
   if (sameUrlHosts(reqUrl, url)) { // prevent purging DoS: RFC 2616 13.10
   purgeEntriesByUrl(url);
   }

 The above are polishing comments.

 I would still insist that urlAbsolute() is documented as its
 complicated, multi-purpose interface is impossible to guess by its name.
 I think it returns NULL when the URL we feed is already absolute, but I
 am not sure. If my understanding if the intent is correct, then
 splitting urlAbsolute in two functions as shown above would be
 preferred.


 +if (strchr(relUrl, ':') != NULL) {
 +return (NULL);
 +}

 According to RFC 2396, Section 3.3 Path Component, a URL path may
 include a colon (':') character. This would make the above check wrong.
 Did I misread the RFC? Are colons banned from URL paths?


 I found this call in urlCanonical():

 if (request-protocol == PROTO_URN) {
 snprintf(urlbuf, MAX_URL, urn:%s, request-urlpath.buf());
 } else {
 /// \todo AYJ: this could use if..else and method == METHOD_CONNECT
 easier.
 switch (request-method.id()) {

 case METHOD_CONNECT:
 snprintf(urlbuf, MAX_URL, %s:%d, request-GetHost(),
 request-port);
 break;

 default:
 portbuf[0] = '\0';

 if (request-port != urlDefaultPort(request-protocol))
 snprintf(portbuf, 32, :%d, request-port);

 snprintf(urlbuf, MAX_URL, %s://%s%s%s%s%s,
  ProtocolStr[request-protocol],
  request-login,
  *request-login ? @ : null_string,
  request-GetHost(),
  portbuf,
  request-urlpath.buf());


 this code in urlCanonicalClean():

 if (request-protocol == PROTO_URN) {
 snprintf(buf, MAX_URL, urn:%s, request-urlpath.buf());
 } else {
 /// \todo AYJ: this could use if..else and method == METHOD_CONNECT
 easier.
 switch (request-method.id()) {

 case METHOD_CONNECT:
 snprintf(buf, MAX_URL, %s:%d,
  request-GetHost(),
  request-port);
 break;

 default:
 portbuf[0] = '\0';

 if (request-port != urlDefaultPort(request-protocol))
 snprintf(portbuf, 32, :%d, request-port);

 loginbuf[0] = '\0';

 if ((int) strlen(request-login)  0) {
 strcpy(loginbuf, request-login);

 if ((t = strchr(loginbuf, ':')))
 *t = '\0';

 strcat(loginbuf, @);
 }

 snprintf(buf, MAX_URL, %s://%s%s%s%s,
  ProtocolStr[request-protocol],
  loginbuf,
  request-GetHost(),
  portbuf,
  

Re: Refresh patterns and ACLs

2008-09-01 Thread Mark Nottingham


On 29/08/2008, at 11:20 PM, Henrik Nordstrom wrote:


On fre, 2008-08-29 at 10:44 +1000, Mark Nottingham wrote:

I'm not convinced it's a great solution, but something like URISpace
may be appropriate;
  http://www.w3.org/TR/urispace.html


Not very different in function, besides being XML..

However, the URISpace xml tag space is a bit too limited, so it  
needs to

be extended a lot. And it's focus is a little different from what we
need in a general sense (but may fit reasonably well for a reverse  
proxy

or web server).


The whole point is that it's extensible. Being XML makes it portable,  
encourages tool development, etc. (which is what we were looking for  
at Akamai).


Again, I'm not saying that this is the answer -- IMO URISpace is too  
compex (although so is the problem space); I just despair of having  
yet another solution in this space; see


http://www.w3.org/2007/powder/
http://xrds-simple.net/
http://www.sitemaps.org/

ad nauseum...





What's nice about this is that you buy some efficiency by walking  
down

the tree, rather than evaluating a linear set of rules...


Same.

Regards
Henrik


--
Mark Nottingham   [EMAIL PROTECTED]




Re: [squid-users] binary data in cache.log with squid 3.0

2008-09-01 Thread Amos Jeffries


 Amos Jeffries escreveu:
 Leonardo Rodrigues Magalhães wrote:

i dont know if this is expected or even desired  but with
 squid 3.0 i'm getting some log entries that brings lots of binary
 data to the log. This binary stuff makes it impossible, sometimes, to
 watch logs with 'tail -f' or even 'cat'. When binary data is
 presented, my terminal just get completly messed up and i have to
 disconnect and reconnect to the machine.

Logs entries are always 'Unsupported method in request' entries.

i skipped directly from 2.5 to 3.0 ... and never had this kind of
 problem (binary data in cache.log) with squid 2.5. Dont know what
 would be the behavior with 2.6 or 2.7 .

in squid 3.0 i'm getting something similar to:

 2008/08/05 11:01:32| clientParseRequestMethod: Unsupported method in
 request 'LTS OF BINARY DATA HERE'
 2008/08/05 11:01:32| clientProcessRequest: Invalid Request




 The unsupported method bit is solved in 3-HEAD/3.1.

 I've just committed a hack to 3.0 which converts the binary codes to
 harmless ascii and truncates the binary stream to something usable.


 Hi Amos,

 Sorry for bringing back this old thread . but i'm facing the
 same problem (binary data in cache.log) i was facing with 3.0Stable8 but
 now with 2.7Stable4.

 is it possible to port that 3.0 hack to 2.7 tree ?


That would be up to Henrik who maintains Squid 2.x
I can't see any reasons why not though. I've posted a proposal to
squid-dev about it.

Amos




Re: binary data

2008-09-01 Thread Adrian Chadd
Erk! I just read that patch!

man isprint. Oh, and man ctype. I'm sure there's a C++ equivalent
somewhere which makes more sense to use.


Adrian

2008/9/2 Amos Jeffries [EMAIL PROTECTED]:
 Henrik,
  Theres been some user interest in porting the binary data hack

 http://www.squid-cache.org/Versions/v3/3.0/changesets/b8877.patch

 to 2.7. How say you?

 Amos





Re: pseudo-specs for a String class

2008-09-01 Thread Amos Jeffries
 On Sun, 2008-08-31 at 14:07 +0200, Kinkie wrote:

 I've gotten a bit forward, now I'm a bit at a loss about where to go
 next.

 I would move string, stream, and other formatting functions away from
 this class.

 I would move searching and comparison functions away from this class.

 I would probably remove default conversion from char*. It buys you very
 little in Squid low-level Buffer context, but can introduce serious
 problems (we have seen some of that during the previous BetterString
 attempts).

 char *exportCopy(void);
 char *exportRefIKnowWhatImDoing(void);

 What do these return and how do they differ?

I though they were
 exportCopy == xstrdup() - or similar anyway.

other wone is just internal buffer export.
Kinkie, I'd make that const and re-work the users that would have it
otherwise.


 KBuf operator = (char const *S, u_int32_t Ssize);

 Does C++ have a tertiary assignment operator?

No. That should be a constructor.

 KBuf out = KBuf(S,strlen(S));



 KBuf consume(u_int32_t howmuch); //from MemBuf

 Since you are not just consuming but creating a new buffer, I would call
 this split(), move(), or extract().


 const int operator [] (int pos);

 Useless without size() and probably does not belong to a low-level buffer.


Also, [] on a string should not return int. It should return whatever a
single character representation is, char or BYTE, etc.

There are two versions of this needed;

  One: const char ...() const;  for read-only access to a char. With tests
returning NULL char or other safe abortion for out-of-range requests.

  The other: char ...();  for write-access, with full tests for string
expansion if necessary to store the new data.


 There are a few places with methods arguments are references that are
 missing const. Please review.


Here are some hints...


 Current interface:

 class KBuf {
 KBuf();
 KBuf(const KBuf S);
 KBuf(const char *S, u_int32_t Ssize);
 KBuf(const char *S); //null-terminated
 ~KBuf();
 bool isNull();

() const;

 KBuf operator = (KBuf S);
 KBuf operator = (char const *S, u_int32_t Ssize);

invalid. The code for this needs to use  ... = KBuf(S, Ssize);

 KBuf operator = (char const *S);  //null-terminated
 KBuf append(KBuf S);
 KBuf append(const char * S, u_int32_t Slen);
 KBuf append(const char * S); //null-terminated
 KBuf append(const char c);   //To be removed?

 const int operator [] (int pos);
 int cmp(KBuf S); //strcmp()

booleans are all:  ...()const;

 bool operator == (const KBuf  S);
 bool operator (KBuf S);
 bool operator (KBuf S);
 void truncate(u_int32_t to_size);
 KBuf consume(u_int32_t howmuch); //from MemBuf
 void terminate(void); //null-terminate
 char *exportCopy(void);
 char *exportRefIKnowWhatImDoing(void);
 KBuf nextToken(const char *delim); //strtok()
 KBuf substr(u_int32_t from, u_int32_t to);

 u_int32_t index(char c);
 u_int32_t rindex(char c);

what about 'absent' case usually referred by signed -1.

 }

These following are where all the tricky bits come in. What are they
actually needed for?

 std::ostream  print (std::ostream os); // for operator
 void dump(ostream os); //dump debugging info
 static ostream  stats(ostream os);
 KBuf appendf(const char *fmt, ...); //to be copied over from membuf

Obsolete if you want to implement a seperate KBufStream object taking
items and formatting them into a KBuf.


 on x86, sizeof(KBuf)=16

 Now I'm a bit at a loss as to how to best integrate with iostream.

Single inline-able function:

 ostream operator (ostream os, KBuf const S) {
os  KBuf.exportRefIKnowWhatImDoing();
 }

The rest is handled by special handlers not related directly to KBuf as
Alex pointed out, but may use its other API calls to set data.


 There's basically three possibilities:

 1. KBuf kb; kb.append(stringstream )
cheapest implementation, but each of those requires two to three
 copies of the contents

 I do not think your buffer should know about stringstream.

 2. Kbuf kb; stringstream ss=kb-stream(); ss (blah).
this seems to be the official way of extending iostreams;
 performed by making KBuf a subclass of stringbuf.
extends sizeof(KBuf) by 32 bytes, and many of its calls need to
 housekeep two states.

 Do not do that, at least not for now. Your buffer is not a stream and
 probably is not a stream buffer either.

 3. Kbuf kb; stringstream ss=kb-stream(); ss  (blah)
 performed by using an adapter class. The coding effort starts to
 be quite noticeable, as keeping the stringbuf and the KBuf in sync is
 not trivial.

 Same as #2.

 4 Kbuf kb(blah). requires kbuf to be a subclass of an ostream.
there's a significant coding effort, AND baloons the size of KBuf
 to 156 bytes.

 #4 does not really require your class to be a child of ostream, but I do
 not think we need this either.


 What's your take on how to better address this?

 IMO, you should not address 

Re: [MERGE] Rework urlAbsolute to be a little more streamlined.

2008-09-01 Thread Alex Rousskov
On Tue, 2008-09-02 at 13:44 +1200, Amos Jeffries wrote:

  There got to be some way to remove this code duplication (and related
  inconsistencies)!
 
 That duplication and a lot more URL mess is all covered by Bug 1961.
 http://www.squid-cache.org/bugs/show_bug.cgi?id=1961
 
 Since the original request from Benno was for Vary invalidation. I think
 we can sign off on that merge when the last memory tweaks of urlAbsolute
 are there.
 
 But ignore the duplication until URL cleanup is done, which will be easier
 after the string buffer Kinkie is working on now goes in.

I think it would be better if the patch that introduces more duplication
and more inconsistencies would instead remove at least some of them,
especially since nobody is working on the related bug1961. The leave
things better than you found them approach often works well, IMO, and
in this particular case, the amount of extra clean up work is minor.

However, I am not going to block this particular submission on the
significantly increases code duplication grounds. 

We should probably add an XXX comment documenting the duplication of
code in urlRelativeToAbsolute.

Thank you,

Alex.




Re: pseudo-specs for a String class

2008-09-01 Thread Kinkie
Thanks for this extra feedback. I appreciate the advice everyone is giving.
I'll have limited access to the net for the next couple of days so I
won't be able to answer to the points you make right away, I should
have the time to incorporate some of your suggestions by then.
 Are you!



On 9/2/08, Alex Rousskov [EMAIL PROTECTED] wrote:
 On Tue, 2008-09-02 at 05:10 +0200, Kinkie wrote:
 On Tue, Sep 2, 2008 at 12:21 AM, Alex Rousskov
 [EMAIL PROTECTED] wrote:
  On Sun, 2008-08-31 at 14:07 +0200, Kinkie wrote:
 
  I've gotten a bit forward, now I'm a bit at a loss about where to go
  next.
 
  I would move string, stream, and other formatting functions away from
  this class.
 
  I would move searching and comparison functions away from this class.
 
  I would probably remove default conversion from char*. It buys you very
  little in Squid low-level Buffer context, but can introduce serious
  problems (we have seen some of that during the previous BetterString
  attempts).

 I'm trying to make the class as useable as possible.
 Subclassing to add those operations is of course a possibility, I
 invite you to check the code out (it has a quite rich test/example
 section).

 Test/examples do not matter in this particular case. What matters is the
 unexpected and expensive implicit conversions that are pretty much
 guaranteed to appear in the real code if implicit conversion from char*
 is allowed. We cannot avoid all implicit conversions, but I would still
 recommend removing the implicit conversion from char* from any
 buffering- or string-related class.

  char *exportCopy(void);
  char *exportRefIKnowWhatImDoing(void);
 
  What do these return and how do they differ?

 The first exports a copy of the KBuf contents (potentially expensive
 but clear from the point of view of memory management), the latter
 exports a reference to the internal storage. Very cheap but quite
 dangerous as the underlying storage may be moved away. Thus the name
 expresses a contract by which the caller declaares to know what she's
 doing.

 Understood. Please document:

  - the mechanism for destroying/freeing a safe copy,
  - whether the safe copy is null-terminated,
  - the scope where it is safe to use the IKnowWhatImDoing reference.

  KBuf operator = (char const *S, u_int32_t Ssize);
 
  Does C++ have a tertiary assignment operator?

 of course not; please conosider that interface is a mock-up. Actual
 functions are:

 assign(KBuf )
 operator = (KBuf )
 assign (const char*, size_t)
 assign (const char*)
 operator = (const char *)

 OK. The conversion from char* string and lack of const comments apply
 to all of these but the third one then.

  const int operator [] (int pos);
 
  Useless without size() and probably does not belong to a low-level
  buffer.

 size() has been implemented since.
 That operator returns -1 (EOF) on out-of-bounds. I agree it's probably
 useless, I'm know of throwing all possible ideas in, removing calls is
 always possible.

 I agree that removing something is always possible. On the other hand,
 reviewing all possible ideas APIs is painful and not very productive.
 Also, it is usually easier to add than to remove.

  There are a few places with methods arguments are references that are
  missing const. Please review.

 Ok. Are you referring to the mockup or did you check the actual code?

 I am referring to the Current interface blob you posted, asking about
 the direction to go next. If the actual interface is not the current
 interface, please post whatever we should be looking at.


 Four more comments on the current interface blob. Sorry if these are
 irrelevant to the actual work.

 * Please document the results of invalid operations such as appending
 more data than would fit or truncating more than there is. I would
 recommend throwing an exception as the starting point, but whatever the
 final/agreed solution is it needs to be documented and discussed.

 * bool isNull();

 Unlike for a pointer, being null is a strange state for a buffer.
 Please consider renaming that method to isEmpty (if that is what you
 meant). BTW, MemBuf::isNull is not isEmpty!

 * KBuf nextToken(const char *delim); //strtok()

 The above API would require removing the token from the object or
 keeping the token pointer as additional data member. Neither approach
 allows multiple concurrent string iterations. Is that intentional?
 Should data iteration be external to the object holding the data?

 * Many methods that probably do not modify the object are missing
 const at the end.


 As already discussed, I have no objections to splitting the class'
 interface in low-level and high-level.
 I'm currently taking the pragmatic approach that the high-level
 functions going to be almost always used anyways, so might as well
 clump them together.

 Not sure why clumping high- and low-level interfaces together is
 pragmatic, but when do you expect to make the decision on whether the
 final interface you recommend would be 

Re: pseudo-specs for a String class

2008-09-01 Thread Kinkie
Er.. That was a 'see you', as mangled by a BlackBerry.



On 9/2/08, Kinkie [EMAIL PROTECTED] wrote:
 Thanks for this extra feedback. I appreciate the advice everyone is giving.
 I'll have limited access to the net for the next couple of days so I
 won't be able to answer to the points you make right away, I should
 have the time to incorporate some of your suggestions by then.
  Are you!



 On 9/2/08, Alex Rousskov [EMAIL PROTECTED] wrote:
 On Tue, 2008-09-02 at 05:10 +0200, Kinkie wrote:
 On Tue, Sep 2, 2008 at 12:21 AM, Alex Rousskov
 [EMAIL PROTECTED] wrote:
  On Sun, 2008-08-31 at 14:07 +0200, Kinkie wrote:
 
  I've gotten a bit forward, now I'm a bit at a loss about where to go
  next.
 
  I would move string, stream, and other formatting functions away from
  this class.
 
  I would move searching and comparison functions away from this class.
 
  I would probably remove default conversion from char*. It buys you very
  little in Squid low-level Buffer context, but can introduce serious
  problems (we have seen some of that during the previous BetterString
  attempts).

 I'm trying to make the class as useable as possible.
 Subclassing to add those operations is of course a possibility, I
 invite you to check the code out (it has a quite rich test/example
 section).

 Test/examples do not matter in this particular case. What matters is the
 unexpected and expensive implicit conversions that are pretty much
 guaranteed to appear in the real code if implicit conversion from char*
 is allowed. We cannot avoid all implicit conversions, but I would still
 recommend removing the implicit conversion from char* from any
 buffering- or string-related class.

  char *exportCopy(void);
  char *exportRefIKnowWhatImDoing(void);
 
  What do these return and how do they differ?

 The first exports a copy of the KBuf contents (potentially expensive
 but clear from the point of view of memory management), the latter
 exports a reference to the internal storage. Very cheap but quite
 dangerous as the underlying storage may be moved away. Thus the name
 expresses a contract by which the caller declaares to know what she's
 doing.

 Understood. Please document:

  - the mechanism for destroying/freeing a safe copy,
  - whether the safe copy is null-terminated,
  - the scope where it is safe to use the IKnowWhatImDoing reference.

  KBuf operator = (char const *S, u_int32_t Ssize);
 
  Does C++ have a tertiary assignment operator?

 of course not; please conosider that interface is a mock-up. Actual
 functions are:

 assign(KBuf )
 operator = (KBuf )
 assign (const char*, size_t)
 assign (const char*)
 operator = (const char *)

 OK. The conversion from char* string and lack of const comments apply
 to all of these but the third one then.

  const int operator [] (int pos);
 
  Useless without size() and probably does not belong to a low-level
  buffer.

 size() has been implemented since.
 That operator returns -1 (EOF) on out-of-bounds. I agree it's probably
 useless, I'm know of throwing all possible ideas in, removing calls is
 always possible.

 I agree that removing something is always possible. On the other hand,
 reviewing all possible ideas APIs is painful and not very productive.
 Also, it is usually easier to add than to remove.

  There are a few places with methods arguments are references that are
  missing const. Please review.

 Ok. Are you referring to the mockup or did you check the actual code?

 I am referring to the Current interface blob you posted, asking about
 the direction to go next. If the actual interface is not the current
 interface, please post whatever we should be looking at.


 Four more comments on the current interface blob. Sorry if these are
 irrelevant to the actual work.

 * Please document the results of invalid operations such as appending
 more data than would fit or truncating more than there is. I would
 recommend throwing an exception as the starting point, but whatever the
 final/agreed solution is it needs to be documented and discussed.

 * bool isNull();

 Unlike for a pointer, being null is a strange state for a buffer.
 Please consider renaming that method to isEmpty (if that is what you
 meant). BTW, MemBuf::isNull is not isEmpty!

 * KBuf nextToken(const char *delim); //strtok()

 The above API would require removing the token from the object or
 keeping the token pointer as additional data member. Neither approach
 allows multiple concurrent string iterations. Is that intentional?
 Should data iteration be external to the object holding the data?

 * Many methods that probably do not modify the object are missing
 const at the end.


 As already discussed, I have no objections to splitting the class'
 interface in low-level and high-level.
 I'm currently taking the pragmatic approach that the high-level
 functions going to be almost always used anyways, so might as well
 clump them together.

 Not sure why clumping high- and low-level interfaces together is
 

Re: [MERGE] Bug #2447: Problems on failed DNS TCP connections

2008-09-01 Thread Martin Pool

Martin Pool has voted comment.
Status is now: Waiting
Comment:
retargetting...

For details, see: 
http://bundlebuggy.aaronbentley.com/project/squid/request/%3C1220137897.6703.4.camel%40henriknordstrom.net%3E

Project: Squid