On Fri, 15 Mar 2002, Gilles Detillieux wrote:

> Date: Fri, 15 Mar 2002 16:34:36 -0600 (CST)
> From: Gilles Detillieux <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc: "ht://Dig developers list" <[EMAIL PROTECTED]>
> Subject: Re: [htdig-dev] "file name.html" -> "filename.html";(
>
> No, the code below does two things: 1) if allow_space_in_url is not
> set, the code works like the standard 3.1.x code does, i.e. it strips
> out all white space characters, and 2) if allow_space_in_url is set,
> the code strips out all white space characters other than the space
> itself - for the space character (ASCII 20 hex) it strips leading and
> trailing spaces and converts the spaces within the URL to %20. The name
> allow_space_in_url is correct, because if the attribute is false,
> no spaces are allowed - they're stripped out, just as the currently
> released code does, in accordance with RFC 2396. However, if you prefer
> encode_space_in_url we can go with that. We're not going to start putting
> all sorts of weird punctuation characters like "%" in attribute names.
>
> > >     static int  allowspace = config.Boolean("allow_space_in_url", 0);
> > >     String      temp;
> > >     while (*ref)
> > >     {
> > >         if (*ref == ' ' && temp.length() > 0 && allowspace)
> > >         {
> > >             // Replace space character with %20 if there's more non-space
> > >             // characters to come...
> > >             char *s = ref+1;
> > >             while (*s && isspace(*s))
> > >                 s++;
> > >             if (*s)
> > >                 temp << "%20";
> > >         }
> > >         else if (!isspace(*ref))
> > >             temp << *ref;
> > >         ref++;
> > >     }
>
> Maybe my description of the code above helps you see the rationale more
> clearly. The attribute selects both behaviours, not just the encoding.
> The reason to make it a user-selectable option is that some users may
> actually prefer htdig to follow the standards rather than ignore them
> like MS/AOL do.
>
> I'm not sure what you mean by integrating my option in the entire patch.
> The code above should be complete on its own, as a change to vanilla
> 3.1.6 URL.cc code. You don't need to integrate it with earlier proposed
> changes - just put it in both URL methods you were changing before and
> make a patch out of it.

I misunderstood. Here is the patch:

-------------------------------------8<-------------------------------------
*** htlib/URL.cc.031202  Thu Feb  7 17:15:38 2002
--- htlib/URL.cc  Fri Mar 15 15:25:27 2002
***************
*** 74,82 ****
  //
  URL::URL(char *ref, URL &parent)
  {
!     String      temp(ref);
!     temp.remove(" \r\n\t");
!     ref = temp;

      _host = parent._host;
      _port = parent._port;
--- 74,97 ----
  //
  URL::URL(char *ref, URL &parent)
  {
!     static int  allowspace = config.Boolean("allow_space_in_url", 0);
!     String      temp;
!     while (*ref)
!     {
!         if (*ref == ' ' && temp.length() > 0 && allowspace)
!         {
!             // Replace space character with %20 if there's more non-space
!             // characters to come...
!             char *s = ref+1;
!             while (*s && isspace(*s))
!                 s++;
!             if (*s)
!                 temp << "%20";
!         }
!         else if (!isspace(*ref))
!             temp << *ref;
!         ref++;
!     }

      _host = parent._host;
      _port = parent._port;
***************
*** 243,255 ****
  }

  //*****************************************************************************
! // void URL::parse(char *u)
  //   Given a URL string, extract the service, host, port, and path from it.
  //
! void URL::parse(char *u)
  {
!     String      temp(u);
!     temp.remove(" \t\r\n");
      char        *nurl = temp;

      //
--- 258,286 ----
  }

  //*****************************************************************************
! // void URL::parse(char *ref)
  //   Given a URL string, extract the service, host, port, and path from it.
  //
! void URL::parse(char *ref)
  {
!     static int  allowspace = config.Boolean("allow_space_in_url", 0);
!     String      temp;
!     while (*ref)
!     {
!         if (*ref == ' ' && temp.length() > 0 && allowspace)
!         {
!             // Replace space character with %20 if there's more non-space
!             // characters to come...
!             char *s = ref+1;
!             while (*s && isspace(*s))
!                 s++;
!             if (*s)
!                 temp << "%20";
!         }
!         else if (!isspace(*ref))
!             temp << *ref;
!         ref++;
!     }
      char        *nurl = temp;

      //
-------------------------------------8<-------------------------------------

But it failed to follow any link ;( I must have misread your
instructions ;) Any ideas?

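
For reference, here is a standalone sketch of the whitespace handling the quoted message describes, written outside of ht://Dig so it can be compiled on its own. It is an illustration under assumptions, not the code in URL.cc: the function name normalize_ref is hypothetical, std::string stands in for ht://Dig's String class, and allowspace is passed as a parameter instead of being read from config. The sketch returns the cleaned copy explicitly. That matters because the loop advances ref to the end of the input; the old constructor code ended with "ref = temp;" and the first hunk of the patch has no equivalent line, while parse() still picks the result up through "char *nurl = temp;". That difference may be worth checking as a cause of the failure, though I have not verified it.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not part of ht://Dig): applies the same whitespace
// rules as the loop in the patch and returns the cleaned copy, so the
// caller never has to rely on where the input pointer ended up.
std::string normalize_ref(const char *ref, bool allowspace)
{
    assert(ref != nullptr);
    std::string temp;
    for (; *ref; ++ref)
    {
        const unsigned char c = static_cast<unsigned char>(*ref);
        if (c == ' ' && !temp.empty() && allowspace)
        {
            // Encode an interior space as %20 only if a non-whitespace
            // character still follows; trailing spaces are dropped.
            const char *s = ref + 1;
            while (*s && std::isspace(static_cast<unsigned char>(*s)))
                ++s;
            if (*s)
                temp += "%20";
        }
        else if (!std::isspace(c))
        {
            // Leading spaces and all other whitespace fall through here
            // and are skipped; everything else is copied unchanged.
            temp += *ref;
        }
    }
    return temp;
}
```

With allowspace true, normalize_ref(" file name.html \n", true) yields "file%20name.html"; with allowspace false the same input yields "filename.html", matching the 3.1.x behaviour of stripping all white space.
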
Regards,

Joe