[
https://issues.apache.org/jira/browse/VFS-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17353634#comment-17353634
]
Claus Stadler edited comment on VFS-805 at 5/28/21, 11:37 PM:
--------------------------------------------------------------
Thanks for the hint with http4/5!
I can see in the debugger that HC4/5 is now used, but unfortunately the
problem persists.
The reason seems to be that HC (at least in VFS2's default configuration)
wants to keep the connection open for further requests and therefore
unconditionally prefers to consume even arbitrarily large responses (this is
what ContentLengthInputStream does) rather than closing the connection.
The stack traces below, which end in ContentLengthInputStream's constructor in
HC, were triggered by the "r.readFully(bytes);" statement in the example from
the issue description.
Not sure whether HC can be reconfigured to avoid running into this issue :/
*HC5*
{code:java}
ContentLengthInputStream.<init>(SessionInputBuffer, InputStream, long) line: 82
DefaultManagedHttpClientConnection(BHttpConnectionBase).createContentInputStream(long, SessionInputBuffer, InputStream) line: 171
...
HttpRequestExecutor.execute(ClassicHttpRequest, HttpClientConnection, HttpResponseInformationCallback, HttpContext) line: 192
{code}
*HC4*
{code:java}
ContentLengthInputStream.<init>(SessionInputBuffer, long) line: 83
LoggingManagedHttpClientConnection(BHttpConnectionBase).createInputStream(long, SessionInputBuffer) line: 213
...
CPoolProxy.receiveResponseEntity(HttpResponse) line: 162
HttpRequestExecutor.doReceiveResponse(HttpRequest, HttpClientConnection, HttpContext) line: 279
{code}
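To make the trade-off described in this comment concrete, here is a minimal sketch that uses only the JDK, not HC or VFS. It compares a body stream whose close() drains the remaining bytes (which keeps a connection reusable) with one that simply abandons them. CloseStrategyDemo, CountingInputStream and BoundedBody are hypothetical names invented for this illustration, and the in-memory byte array stands in for the socket; this does not claim to reproduce HC's actual implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical demo; none of these classes are part of HttpClient or VFS. */
public class CloseStrategyDemo {

    /** Counts the bytes pulled from the wrapped stream (a stand-in for the socket). */
    static final class CountingInputStream extends FilterInputStream {
        long consumed;

        CountingInputStream(InputStream in) { super(in); }

        @Override public int read() throws IOException {
            int b = super.read();
            if (b >= 0) consumed++;
            return b;
        }

        @Override public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) consumed += n;
            return n;
        }
    }

    /** Body of known length; close() either drains the rest or abandons it. */
    static final class BoundedBody extends InputStream {
        private final InputStream source;
        private final boolean drainOnClose;
        private long remaining;
        private boolean closed;

        BoundedBody(InputStream source, long contentLength, boolean drainOnClose) {
            this.source = source;
            this.remaining = contentLength;
            this.drainOnClose = drainOnClose;
        }

        @Override public int read() throws IOException {
            if (closed || remaining <= 0) return -1;
            int b = source.read();
            if (b >= 0) remaining--;
            return b;
        }

        @Override public int read(byte[] buf, int off, int len) throws IOException {
            if (len == 0) return 0;
            if (closed || remaining <= 0) return -1;
            int n = source.read(buf, off, (int) Math.min(len, remaining));
            if (n > 0) remaining -= n;
            return n;
        }

        @Override public void close() throws IOException {
            if (closed) return;
            if (drainOnClose) {
                // Read and discard everything that is left of the body.
                byte[] scratch = new byte[8192];
                while (read(scratch, 0, scratch.length) >= 0) { /* discard */ }
            }
            closed = true;
        }
    }

    /** Reads `wanted` bytes of a `total`-byte body, closes it, and returns the bytes pulled from the source. */
    static long bytesConsumed(int total, int wanted, boolean drainOnClose) throws IOException {
        CountingInputStream wire = new CountingInputStream(new ByteArrayInputStream(new byte[total]));
        try (InputStream body = new BoundedBody(wire, total, drainOnClose)) {
            byte[] buf = new byte[wanted];
            int off = 0;
            while (off < wanted) {
                int n = body.read(buf, off, wanted - off);
                if (n < 0) break;
                off += n;
            }
        }
        return wire.consumed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("drain on close:   " + bytesConsumed(1_000_000, 100, true));
        System.out.println("abandon on close: " + bytesConsumed(1_000_000, 100, false));
    }
}
```

With a draining close the cost of closing grows with the size of the response rather than with the bytes actually requested, which matches the slow "Subsequent seek" timing reported in the issue. The abandoning variant is cheap, but on a real connection it would presumably mean giving up connection reuse.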
> HTTP seek always exhausts response
> ----------------------------------
>
> Key: VFS-805
> URL: https://issues.apache.org/jira/browse/VFS-805
> Project: Commons VFS
> Issue Type: Bug
> Affects Versions: 2.8.0
> Reporter: Claus Stadler
> Priority: Major
>
> Seeking on an HTTP resource always downloads ALL content if a Content-Length
> header is present. The problem is that seeking closes the current input
> stream which eventually ends up in ContentLengthInputStream.close() of the
> (ancient) http client library.
>
> To be clear, the problem is actually not with the seek itself, but with the
> underlying close implementation that always exhausts the HTTP response body.
> See the example below.
>
> My use case is to perform binary search on sorted datasets on the Web (RDF
> data in sorted ntriple syntax) - the binary search works locally and *in
> principle* works on HTTP resources abstracted with VFS2, but the seek
> implementation that downloads *ALL* data (in my case several GBs)
> unfortunately defeats the purpose :(
>
> From org.apache.commons.httpclient.ContentLengthInputStream
> (commons-httpclient-3.1):
> {code:java}
> public void close() throws IOException {
>     if (!closed) {
>         try {
>             ChunkedInputStream.exhaustInputStream(this);
>         } finally {
>             // close after above so that we don't throw an exception trying
>             // to read after closed!
>             closed = true;
>         }
>     }
> }
> {code}
> Example:
> {code:java}
> public static void main(String[] args) throws Exception {
>     String url = "http://localhost/large-file-2gb.txt";
>     FileSystemManager fsManager = VFS.getManager();
>
>     try (FileObject file = fsManager.resolveFile(url)) {
>         try (RandomAccessContent r = file.getContent().getRandomAccessContent(RandomAccessMode.READ)) {
>
>             StopWatch sw1 = StopWatch.createStarted();
>             r.seek(20);
>             System.out.println("Initial seek: " + sw1.getTime(TimeUnit.MILLISECONDS));
>
>             StopWatch sw2 = StopWatch.createStarted();
>             byte[] bytes = new byte[100];
>             r.readFully(bytes);
>             System.out.println("Read: " + sw2.getTime(TimeUnit.MILLISECONDS));
>
>             StopWatch sw3 = StopWatch.createStarted();
>             r.seek(100);
>             System.out.println("Subsequent seek: " + sw3.getTime(TimeUnit.MILLISECONDS));
>         }
>     }
>     System.out.println("Done");
> }
> {code}
> Output (times in milliseconds):
> {code}
> Initial seek: 0
> Read: 4
> Subsequent seek: 2538
> Done
> {code}
>
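The binary-search use case in the issue description only pays off if each probe transfers a small, bounded number of bytes. The following sketch illustrates that access pattern against an in-memory stand-in for the remote file. RangeSource, CountingSource, sampleData and find are hypothetical names invented for this example and are not part of VFS or HC; the data is assumed to be ASCII lines terminated by '\n' and sorted in String.compareTo order.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical sketch of binary search over bounded byte-range reads. */
public class RangeBinarySearch {

    /** Assumed abstraction for "fetch bytes [offset, offset + length)"; not a VFS type. */
    interface RangeSource {
        long size();
        byte[] read(long offset, int length);
    }

    /** In-memory stand-in for a remote file that records how many bytes it served. */
    static final class CountingSource implements RangeSource {
        private final byte[] data;
        long bytesServed;

        CountingSource(byte[] data) { this.data = data; }

        public long size() { return data.length; }

        public byte[] read(long offset, int length) {
            int from = (int) Math.min(offset, data.length);
            int to = (int) Math.min(offset + length, data.length);
            bytesServed += to - from;
            return Arrays.copyOfRange(data, from, to);
        }
    }

    private static final int BLOCK = 64; // bytes fetched per probe while scanning for '\n'

    /** Offset of the first '\n' at or after pos, or size() if there is none. */
    static long indexOfNewline(RangeSource src, long pos) {
        while (pos < src.size()) {
            byte[] block = src.read(pos, BLOCK);
            for (int i = 0; i < block.length; i++) {
                if (block[i] == '\n') return pos + i;
            }
            pos += block.length;
        }
        return src.size();
    }

    /** Returns the byte offset of the line equal to key, or -1 if no such line exists. */
    static long find(RangeSource src, String key) {
        long lo = 0, hi = src.size();
        while (lo < hi) {
            long mid = lo + (hi - lo) / 2;
            // Start of the first line that begins at or after mid.
            long start = (mid == 0) ? 0 : indexOfNewline(src, mid - 1) + 1;
            if (start >= hi) { hi = mid; continue; }
            long end = indexOfNewline(src, start);
            String line = new String(src.read(start, (int) (end - start)), StandardCharsets.UTF_8);
            int cmp = line.compareTo(key);
            if (cmp == 0) return start;
            if (cmp < 0) lo = end + 1; else hi = mid;
        }
        return -1;
    }

    /** Builds n sorted lines "key00000", "key00001", ... of 9 bytes each (including '\n'). */
    static byte[] sampleData(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) sb.append(String.format(Locale.ROOT, "key%05d\n", i));
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        CountingSource src = new CountingSource(sampleData(10_000));
        System.out.println("offset: " + find(src, "key07321"));
        System.out.println("bytes served: " + src.bytesServed + " of " + src.size());
    }
}
```

Each iteration fetches at most a few small blocks, so the total transfer grows with the logarithm of the file size. If every probe instead triggered a close that drains the rest of the response, as described above, that advantage would be lost.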
--
This message was sent by Atlassian Jira
(v8.3.4#803005)