aseek-devel  

Re: [aseek-devel] parse.cpp.patch

Matt Sullivan
Tue, 26 Aug 2003 22:45:52 +0000

Jens,

This is handled implicitly by lines 884 - 901 for the first part and
lines 2199 - 2216 for the second in current CVS HEAD.  That is, if the
new URI is relative (does not have a scheme / host / path), then the
parent's scheme / host / path are inherited.
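
For illustration only, here is a minimal sketch of that inheritance
rule.  It is not the code in parse.cpp: SimpleUri, parse_absolute,
resolve and to_string are hypothetical names made up for this example,
and it deliberately ignores "..", query strings, fragments and
"//host/..." references.

```cpp
#include <string>

// Hypothetical, simplified URI holder; NOT ASPseek's actual URL class.
struct SimpleUri {
    std::string scheme;  // e.g. "http"
    std::string host;    // e.g. "www.example.org"
    std::string path;    // e.g. "/docs/index.html"
};

// Split "scheme://host/path" into its parts.
// Returns false when the string does not start with "scheme://".
bool parse_absolute(const std::string& s, SimpleUri& out) {
    const std::string::size_type sep = s.find("://");
    if (sep == std::string::npos || sep == 0) return false;
    // "://" only counts when it comes before any '/', '?' or '#';
    // otherwise it is part of a path or query, not a scheme separator.
    if (s.find_first_of("/?#") < sep) return false;

    out.scheme = s.substr(0, sep);
    const std::string::size_type host_start = sep + 3;
    const std::string::size_type slash = s.find('/', host_start);
    if (slash == std::string::npos) {
        out.host = s.substr(host_start);
        out.path = "/";
    } else {
        out.host = s.substr(host_start, slash - host_start);
        out.path = s.substr(slash);
    }
    return true;
}

// Resolve a reference against its parent document's URI.
SimpleUri resolve(const SimpleUri& parent, const std::string& ref) {
    SimpleUri result;
    if (parse_absolute(ref, result)) return result;  // nothing to inherit

    // Relative reference: scheme and host always come from the parent.
    result.scheme = parent.scheme;
    result.host = parent.host;

    if (!ref.empty() && ref[0] == '/') {
        // Absolute path such as "/foo/bar/xxx.html": replaces the path.
        result.path = ref;
    } else {
        // Relative path: keep the parent's directory, append the reference.
        const std::string::size_type last = parent.path.rfind('/');
        const std::string dir = (last == std::string::npos)
                                    ? std::string("/")
                                    : parent.path.substr(0, last + 1);
        result.path = dir + ref;
    }
    return result;
}

std::string to_string(const SimpleUri& u) {
    return u.scheme + "://" + u.host + u.path;
}
```

With a parent of http://www.example.org/docs/index.html, the reference
"/foo/bar/xxx.html" resolves to http://www.example.org/foo/bar/xxx.html
in this sketch, which is the outcome the patch is after.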

Can you give an explicit example of where you see this failing to work?


Thanks,
Matt.

On Tue, 26 Aug 2003 at 18:55:48 +0200, Jens Thoms Toerring wrote:

> This is a patch I haven't mentioned yet. It's actually for two
> separate problems (sorry, Kir).
> 
> The first one is related to "Location:" entries in the header
> the server sends. According to RFC2616 it should be followed by
> an absoluteURI, but on some machines there's just an absolute
> path to a different page on the server instead. The first part
> of the patch is to deal with this more gracefully.
> 
> The second patch is for cases where there are links in a document
> that start with a slash, i.e. something like
> 
> <a href="/foo/bar/xxx.html">
> 
> As far as I can see these are not treated correctly (aspseek does
> not seem to follow these links), and that's what the second part
> of the patch is for.
>                                     Regards, Jens
> -- 
>  Freie Universitaet Berlin     Jens Thoms Toerring
>  Universitaetsbibliothek
>  Webteam                       Tel: 0049 30 838 56055
>  Garystrasse 39                Fax: 0049 30 838 53738
>  14195 Berlin                  e-mail: [EMAIL PROTECTED]
> 
> 
> --- aspseek-orig/src/parse.cpp        2003-08-19 13:50:25.000000000 +0200
> +++ aspseek-my/src/parse.cpp  2003-08-26 18:38:21.000000000 +0200
> @@ -876,6 +876,14 @@
>                               string location_unescaped;
>                               char *location_trim = str_trim(location);
>                               URIUnescapeSGML(location_trim, location_unescaped, ucontent.m_charset);
> +
> +                             // If the URI isn't RFC2616 conform, i.e. isn't a absoluteURI
> +                             // but just a path prepend it by the server name in the hope
> +                             // to make it an absoluteURI...
> +
> +                             if ( *location_unescaped.c_str() == '/' )
> +                                     location_unescaped = m_url + location_unescaped;
> +
>                               if (!newURL.ParseURL(location_unescaped.c_str()))
>                               {
>                                       int newMethod;
> @@ -2187,6 +2195,13 @@
>                               string href_unescaped;
>                               char *href_trim = str_trim(href);
>                               URIUnescapeSGML(href_trim, href_unescaped, ucontent->m_charset);
> +
> +                             // Prepend the reference with the server name etc. if it's an
> +                             // absolute path, otherwise we get in trouble later
> +
> +                             if ( *href_unescaped.c_str() == '/' )
> +                                     href_unescaped = CurSrv->m_url + href_unescaped;
> +
>                               if (doc->m_hops >= CurSrv->m_maxhops)
>                               {
>                               }
