William Porter created HTTPCLIENT-1486:
------------------------------------------

             Summary: Quirky Behavior in URIUtils leads to Improper Request Execution
                 Key: HTTPCLIENT-1486
                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1486
             Project: HttpComponents HttpClient
          Issue Type: Bug
          Components: HttpClient
    Affects Versions: 4.3.3
            Reporter: William Porter
            Priority: Minor


While executing an HttpUriRequest with a CloseableHttpClient, malformed URIs can 
lead to HTTP requests being executed for unexpected resources.  The root issue 
is in the extractHost() method in URIUtils, and is demonstrated by the 
following example.

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.http.HttpHost;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpUriRequest;
import org.apache.http.client.utils.URIUtils;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.log4j.BasicConfigurator;
import org.junit.Assert;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;


public class Main {
        
        private static final Logger LOG = LoggerFactory.getLogger(Main.class);

        public static void main(String [] args) {
                
                // Set up Log4J logging
                BasicConfigurator.configure();
                
                try {
                        
                        // The following is a strange URI string, possibly a typo, that
                        // omits the / between the authority and the 'intended' path
                        final String strangeUriString =
                                "http://www.example.com:80somepath/someresource.html";

                        // While resolving the host and port as www.example.com and 80
                        // from the authority does not necessarily seem strange, it can
                        // have unintended consequences at higher levels of indirection
                        Assert.assertEquals(new HttpHost("www.example.com", 80),
                                URIUtils.extractHost(new URI(strangeUriString)));

                        // Now we construct a request with the strange URI string
                        HttpUriRequest request = new HttpGet(strangeUriString);

                        // We create a CloseableHttpClient to execute the request
                        final HttpClientBuilder builder = HttpClientBuilder.create();
                        HttpClient client = builder.build();

                        // Here, the request is executed, but is actually a
                        // GET /someresource.html on www.example.com:80, since part of
                        // the intended path was considered part of the authority by
                        // the URI class but disregarded by URIUtils
                        final HttpResponse response = client.execute(request);
                        LOG.info("Response: {}", response.getStatusLine().toString());
                        
                        
                } catch (final URISyntaxException e) {
                        LOG.error("UriSyntaxException: {}", e.getMessage());
                } catch (final ClientProtocolException e) {
                        LOG.error("ClientProtocolException: {}", e.getMessage());
                } catch (final IOException e) {
                        LOG.error("IOException: {}", e.getMessage());
                }
                
        }
}


This bug may have been introduced by the fix for 
https://issues.apache.org/jira/browse/HTTPCLIENT-1166.  It might be 
preferable to throw an exception in this case rather than parse the host and 
port leniently, but further discussion may be warranted given the comments on 
that issue.
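
To make the "throw an exception" option concrete, here is a minimal sketch of a stricter check using only java.net.URI. The class StrictAuthorityCheck and the helper requireServerAuthority() are hypothetical names chosen for illustration; they are not part of HttpClient, and this is not a proposal for how URIUtils itself should be changed. The sketch relies on URI.parseServerAuthority(), which throws a URISyntaxException when an authority is present but cannot be parsed as [userinfo@]host[:port].

```java
import java.net.URI;
import java.net.URISyntaxException;

public class StrictAuthorityCheck {

    // Hypothetical helper (not part of HttpClient): rejects URIs whose
    // authority cannot be parsed as a server-based authority.
    static URI requireServerAuthority(final String s) throws URISyntaxException {
        final URI uri = new URI(s);
        // Throws URISyntaxException if the authority is defined but is not
        // of the form [userinfo@]host[:port].
        return uri.parseServerAuthority();
    }

    public static void main(final String[] args) throws URISyntaxException {
        final String strange = "http://www.example.com:80somepath/someresource.html";

        // Lenient view: what java.net.URI reports for the strange string.
        final URI lenient = new URI(strange);
        System.out.println("authority = " + lenient.getAuthority());
        System.out.println("host      = " + lenient.getHost());
        System.out.println("port      = " + lenient.getPort());
        System.out.println("path      = " + lenient.getPath());

        // Strict view: refuse to continue if the authority is not server-based.
        try {
            requireServerAuthority(strange);
            System.out.println("accepted");
        } catch (final URISyntaxException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A caller could run such a check on the URI string before constructing the HttpGet, so that the typo surfaces as a URISyntaxException instead of a request for a different resource. Whether that strictness is acceptable depends on the leniency discussed in HTTPCLIENT-1166.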


Here is some debug output showing that the request is actually a 
GET /someresource.html:

87 [main] DEBUG org.apache.http.wire  - http-outgoing-0 >> "GET /someresource.html HTTP/1.1[\r][\n]"
87 [main] DEBUG org.apache.http.wire  - http-outgoing-0 >> "Host: www.example.com:80[\r][\n]"
87 [main] DEBUG org.apache.http.wire  - http-outgoing-0 >> "Connection: Keep-Alive[\r][\n]"
87 [main] DEBUG org.apache.http.wire  - http-outgoing-0 >> "User-Agent: Apache-HttpClient/4.3.3 (java 1.5)[\r][\n]"
87 [main] DEBUG org.apache.http.wire  - http-outgoing-0 >> "Accept-Encoding: gzip,deflate[\r][\n]"
87 [main] DEBUG org.apache.http.wire  - http-outgoing-0 >> "[\r][\n]"



--
This message was sent by Atlassian JIRA
(v6.2#6252)
