Aklakan opened a new pull request #186:
URL: https://github.com/apache/commons-vfs/pull/186


   [Link to VFS-805 on Jira](https://issues.apache.org/jira/browse/VFS-805)
   
   This PR affects the http4 and http5 URI schemes. The http3 scheme is 
unaffected because it does not use the 
`MonitoredHttpResponseContentInputStream` abstraction.
   
   With this PR, closing a `MonitoredHttpResponseContentInputStream` no longer 
closes the underlying stream, because doing so may download *all* remaining 
data. The `HttpResponse` itself is still closed, of course.
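
   As a rough illustration of the close behavior described above, here is a 
minimal, library-free sketch. `NonDrainingResponseStream` and the `Closeable` 
standing in for the HTTP response are hypothetical names for illustration only; 
this is not the actual code of this PR.

```java
import java.io.Closeable;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical stand-in for a monitored response stream: close() releases the
 * response but deliberately does not call close() on the wrapped content stream.
 */
class NonDrainingResponseStream extends FilterInputStream {
    private final Closeable response; // stand-in for the HTTP response object
    private boolean closed;

    NonDrainingResponseStream(InputStream content, Closeable response) {
        super(content);
        this.response = response;
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return; // make repeated close() calls harmless
        }
        closed = true;
        // Intentionally skip super.close(): closing a length-delimited content
        // stream may read the rest of the body before returning.
        response.close();
    }
}
```

   The key design choice in this sketch is that only the response is closed; 
the wrapped stream is never asked to close itself.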
   
   The `http5.MonitoredHttpResponseContentInputStream` variant additionally 
required replacing the HTTP entity with a dummy one during close in order to 
prevent downloading all remaining data: in contrast to hc4, closing the HTTP 
response with hc5 now also closes the entity. Interestingly, replacing the 
entity is possible and does not close the existing one. I am not sure how 
stable this behavior is, but I couldn't find any reasonable alternative. For 
example, `CloseableHttpResponse` has a package-private constructor, so it is 
not possible to write a custom HTTP response interceptor that creates an 
instance of a custom subclass of `CloseableHttpResponse` that doesn't close the 
entity.
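
   The entity-swap idea can be modeled without any HttpClient dependency. In 
the sketch below, `ModelResponse` and `EntitySwap` are hypothetical stand-ins, 
not hc5 types; the model assumes, as described above, that closing the response 
closes whatever entity it currently holds and that replacing the entity leaves 
the previous one untouched.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical model of a response that closes its current entity on close(). */
class ModelResponse implements Closeable {
    private Closeable entity;

    ModelResponse(Closeable entity) {
        this.entity = entity;
    }

    /** Swaps the entity without closing the previous one. */
    void setEntity(Closeable newEntity) {
        this.entity = newEntity;
    }

    @Override
    public void close() throws IOException {
        if (entity != null) {
            entity.close();
        }
    }
}

class EntitySwap {
    /** Closes the response after replacing its entity with a no-op dummy. */
    static void closeWithoutTouchingEntity(ModelResponse response) throws IOException {
        response.setEntity(() -> { }); // dummy entity: close() does nothing
        response.close();              // closes only the dummy
    }
}
```

   Whether the real hc5 response keeps behaving this way across releases is 
exactly the stability question raised above; the model only shows why the swap 
avoids closing the original entity.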
   
   Also, I am not sure how best to provide a test case:
   * While I see that the unit tests spawn an HTTP server, I am not sure how 
much effort it would take for that server to *track statistics* about 
how much content was retrieved.
   * HttpClient 3/4/5 all return a `ContentLengthInputStream` if the HTTP 
header includes a `Content-Length` entry; this part is pretty much hard-coded 
in all versions of hc. So there doesn't seem to be a simple way to inject a 
custom `InputStream` implementation that tracks how much data is being read.
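
   For reference, the kind of byte-counting wrapper such a test would need 
might look like the sketch below. `CountingInputStream` is a hypothetical 
helper written against `java.io` only; it does not solve the injection problem 
described in the list above.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical wrapper that counts bytes consumed from the wrapped stream. */
class CountingInputStream extends FilterInputStream {
    private final AtomicLong bytesRead = new AtomicLong();

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            bytesRead.incrementAndGet();
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        // read(byte[]) in FilterInputStream delegates here, so it is counted once.
        int n = super.read(buf, off, len);
        if (n > 0) {
            bytesRead.addAndGet(n);
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        // Skipped bytes are counted as consumed in this sketch.
        long skipped = super.skip(n);
        bytesRead.addAndGet(skipped);
        return skipped;
    }

    long getBytesRead() {
        return bytesRead.get();
    }
}
```

   A test could then compare `getBytesRead()` against the file size after a 
seek-and-close cycle, provided some way is found to place the wrapper beneath 
the stream that HttpClient creates.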
   
   Locally, I tested this change with the benchmark described in the Jira issue 
and let it perform 100k iterations of seeking over a 2 GB file in order to 
reveal potential connection leaks. It works for me.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

