Well, until they correct this in the server-side configuration, I still need to handle it as tolerantly as possible, that is, get the content of the site just as telnet, wget, or a browser does. I want HttpClient, after e.g. being redirected twice with an invalid URI, to just use //path-whatever instead of trying to cut it down to /path-whatever.
thanx

On Thu, Jun 30, 2011 at 2:05 PM, Oleg Kalnichevski <[email protected]> wrote:

> On Thu, 2011-06-30 at 11:45 +0200, khiem nguyen wrote:
> > i dont think it's problem of redirect here,
> >
>
> Well, it is. The redirect location is invalid and leads to the following
> request having an ambiguous request-URI
>
> > i'm using httclient for proxying
> > request from browser & just handle redirect-url back to browser , which in
> > turn always the same, httpclient fires /Sale.... instead of //Sale ...,
> > server redirect with ...//Sale/... again
> >
>
> //Sale is not a valid URI.
>
> > where can override this behavior ?
>
> See my previous message.
>
> Oleg
>
> > thanx alot
> >
> >
> > On Wed, Jun 29, 2011 at 9:10 PM, Oleg Kalnichevski <[email protected]> wrote:
> >
> > > On Wed, 2011-06-29 at 12:21 +0200, khiem nguyen wrote:
> > > > Hi, i tried to retrieve the content of this link:
> > > >
> > > > http://de.tommy.com//Sale/600000,de_DE,sc.html
> > > >
> > > > & got circular redirect, logging tells me that httpclient fires : GET
> > > > /Sale/600000,de_DE,sc.html
> > > > server response with redirect back to
> > > > http://de.tommy.com//Sale/600000,de_DE,sc.html
> > > >
> > > > wget behaves like browser & gives back the content.
> > > >
> > > > with telnet:
> > > >
> > > > telnet de.tommy.com 80
> > > > Trying 89.202.105.72...
> > > > Connected to de.tommy.com.
> > > > Escape character is '^]'.
> > > > GET /Sale/600000,de_DE,sc.html HTTP/1.1
> > > > Host:de.tommy.com
> > > >
> > > > HTTP/1.1 301 Moved Permanently
> > > > Date: Wed, 29 Jun 2011 10:11:15 GMT
> > > > Server: Apache
> > > > Content-Length: 0
> > > > Set-Cookie: dwsid=CvVvWMuShdGfstjxicXY9lJb8Fk8gkMT8xV8zGEU_X1Y81Rt4F-469BS_cTJZ4hHcE7f5NVeacb1VKcXHFEKGg==; path=/; HttpOnly
> > > > Cache-Control: no-cache,no-store,must-revalidate
> > > > Pragma: no-cache
> > > > Expires: Thu, 01 Dec 1994 16:00:00 GMT
> > > > Location: http://de.tommy.com//Sale/600000,de_DE,sc.html
> > > > Vary: Accept-Encoding
> > > > Accept-Ranges: bytes
> > > > Content-Type: text/plain
> > > >
> > > > Connection closed by foreign host.
> > > > -----
> > > >
> > > > de.tommy.com 80
> > > > Trying 89.202.105.72...
> > > > Connected to de.tommy.com.
> > > > Escape character is '^]'.
> > > > GET //Sale/600000,de_DE,sc.html HTTP/1.1
> > > > Host: de.tommy.com
> > > >
> > > > HTTP/1.1 200 OK
> > > > Date: Wed, 29 Jun 2011 10:07:11 GMT
> > > > Server: Apache
> > > > Set-Cookie: ....
> > > > ....content
> > > >
> > > > ...
> > > >
> > > > seems like httpclient strip out one of the 2 slashes.
> > > > is it a bug or the server is misconfigured ( i guess they use rewrite or
> > > > something but its not rare)
> > > >
> > > > how can i fix this ?
> > > > thanx
> > >
> > > The redirect returned by the server is malformed
> > >
> > > http://www.ietf.org/rfc/rfc2396.txt
> > >
> > > ---
> > > 3.3. Path Component
> > >
> > > The path component contains data, specific to the authority (or the
> > > scheme if there is no authority component), identifying the resource
> > > within the scope of that scheme and authority.
> > >
> > >    path          = [ abs_path | opaque_part ]
> > >
> > >    path_segments = segment *( "/" segment )
> > >    segment       = *pchar *( ";" param )
> > >    param         = *pchar
> > >
> > >    pchar         = unreserved | escaped |
> > >                    ":" | "@" | "&" | "=" | "+" | "$" | ","
> > >
> > > The path may consist of a sequence of path segments separated by a
> > > single slash "/" character. Within a path segment, the characters
> > > "/", ";", "=", and "?" are reserved. Each path segment may include a
> > > sequence of parameters, indicated by the semicolon ";" character.
> > > The parameters are not significant to the parsing of relative
> > > references.
> > >
> > > ---
> > > The path element of the URI is not supposed to have multiple consecutive
> > > slashes. Such URIs are ambiguous and whichever way HttpClient tries to
> > > normalize them it cannot get it right all the time. You have two options
> > > here: turning off automatic redirect and handling redirects manually or
> > > building a custom RedirectStrategy.
> > >
> > > Hope this helps
> > >
> > > Oleg
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [email protected]
> > > For additional commands, e-mail: [email protected]
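
For anyone landing on this thread later: one way to read the "handle redirects manually" suggestion is to take the request target from the Location value exactly as the server sent it, instead of normalizing it first. The sketch below is only an illustration using the JDK's java.net.URI, not HttpClient's own classes; the names RedirectTargets, verbatimTarget and normalizedTarget are made up for this example. It also shows why a bare //Sale/... is ambiguous: read as a relative reference, the part after the two slashes is parsed as a host name rather than as a path.

```java
import java.net.URI;

// Hypothetical helper, not part of HttpClient: compares the request target
// taken verbatim from a redirect Location with the one left after normalization.
final class RedirectTargets {

    // Raw path plus raw query, exactly as they appear in the Location value.
    static String verbatimTarget(URI location) {
        String path = location.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/"; // an empty path is requested as "/"
        }
        String query = location.getRawQuery();
        return query == null ? path : path + "?" + query;
    }

    // Same, but after URI.normalize(); in the JDK implementation this drops the
    // empty segment, so the leading "//" is collapsed to a single "/".
    static String normalizedTarget(URI location) {
        return verbatimTarget(location.normalize());
    }

    public static void main(String[] args) {
        URI location = URI.create("http://de.tommy.com//Sale/600000,de_DE,sc.html");
        System.out.println("verbatim:   " + verbatimTarget(location));
        System.out.println("normalized: " + normalizedTarget(location));

        // Read on its own, "//Sale/..." is a network-path reference:
        // "Sale" is taken as the authority, not as the first path segment.
        URI bare = URI.create("//Sale/600000,de_DE,sc.html");
        System.out.println("authority:  " + bare.getAuthority());
        System.out.println("path:       " + bare.getRawPath());
    }
}
```

With automatic redirects turned off, the verbatim target is what a tolerant client would put on the request line for the next hop, which matches the telnet request that returned 200 OK above. Whether that is the right choice for other servers is a judgment call, since the URI is still invalid.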
