Yihua Huang created HTTPCLIENT-1432:
---------------------------------------

             Summary: Lazy decompressing of HttpEntity.getContent()
                 Key: HTTPCLIENT-1432
                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1432
             Project: HttpComponents HttpClient
          Issue Type: Improvement
          Components: HttpClient
    Affects Versions: 4.3.1, 4.3.2
            Reporter: Yihua Huang
            Priority: Minor


In 4.3, DecompressingEntity is used to decompress the entity of an HTTP 
response. When we call DecompressingEntity.getContent(), a new 
DeflateInputStream or GZIPInputStream is created, and the header of the 
compressed data is read and checked at that point. 

    InputStream decorate(final InputStream wrapped) throws IOException {
        return new GZIPInputStream(wrapped);
    }

In some cases, we don't really need to decompress it. For example, for 
"http://baike.baidu.com/search/word?word=httpclient&pic=1&sug=1&enc=utf8" the 
response status code is 302 and the response contains the header 
"Content-Encoding:gzip" but no entity data (this happens only sometimes). In 
RedirectExec.execute(), we don't read the entity, but at the end it tries to 
close the input stream with EntityUtils.consume(response.getEntity()). When 
entity.getContent() is called inside EntityUtils.consume(), an EOFException is 
thrown and the redirect cannot continue. 
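
The failure can be reproduced without HttpClient. The following is a minimal 
sketch using only java.util.zip; the class and method names 
(EmptyGzipBodyDemo, failsOnConstruction) are made up for illustration and are 
not part of HttpClient. It shows that constructing a GZIPInputStream over an 
empty body fails before any read() call, because the constructor parses the 
gzip header.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical demo class, not part of HttpClient.
final class EmptyGzipBodyDemo {

    // Returns true if merely constructing a GZIPInputStream over the given
    // bytes fails with EOFException, i.e. before any read() call is made.
    static boolean failsOnConstruction(byte[] body) throws IOException {
        InputStream raw = new ByteArrayInputStream(body);
        try {
            // The constructor reads and validates the gzip header eagerly.
            new GZIPInputStream(raw).close();
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // An empty body, as in the 302 response described above.
        System.out.println(failsOnConstruction(new byte[0])); // prints "true"
    }
}
```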

In this case, we don't care about the actual entity, even if the compression 
format is not valid.

In my opinion, the decompressing stream should be created and its format 
checked ONLY when we need to read the content, not when we are merely closing 
it. So I wrote LazyDecompressingInputStream as a wrapper that does not create 
the decompressing stream until the read() method is called. With this change, 
more websites will be supported.

     


