[ 
https://issues.apache.org/jira/browse/IVY-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13875378#comment-13875378
 ] 

Loren Kratzke edited comment on IVY-1197 at 1/18/14 4:42 AM:
-------------------------------------------------------------

I have identified the root cause of the issue and I have a solution.

Problem: 
The JVM throws java.lang.OutOfMemoryError during ivy:publish of a file larger 
than a few hundred MB.

Cause: 
The HttpURLConnection object, as configured in 
org.apache.ivy.util.url.BasicURLHandler, buffers the data before sending so as 
to calculate the Content-Length header value. This alone is bad, but it is 
likely aggravated by the default array resizing algorithm of the 
ByteArrayOutputStream used to buffer the data. 

The buffer starts small, with a 32 byte capacity. As data is written, it fills 
up. Once it is full, a new buffer of twice the size is allocated. The content 
of the original buffer is copied to the new buffer and the old buffer becomes 
eligible for garbage collection. Lather, rinse, repeat.
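
The doubling described above can be modelled in a few lines. This is only a 
simplified sketch of the growth pattern, not the JDK implementation: it assumes 
a 32 byte starting capacity and strict doubling, and the class and method names 
(BufferGrowthSketch, capacities, peakBytesDuringLastResize) are made up for 
illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class BufferGrowthSketch {

    /**
     * Returns the successive capacities a doubling buffer passes through
     * until it can hold payloadBytes, starting from initialCapacity.
     * Simplified model: assumes strict doubling on every resize.
     */
    static List<Long> capacities(long initialCapacity, long payloadBytes) {
        List<Long> steps = new ArrayList<>();
        long capacity = initialCapacity;
        steps.add(capacity);
        while (capacity < payloadBytes) {
            capacity *= 2; // a new array of twice the size is allocated, then copied into
            steps.add(capacity);
        }
        return steps;
    }

    /**
     * Heap needed during the last resize: the old array and the new array
     * are both live while the contents are being copied.
     */
    static long peakBytesDuringLastResize(long initialCapacity, long payloadBytes) {
        List<Long> steps = capacities(initialCapacity, payloadBytes);
        long last = steps.get(steps.size() - 1);
        long previous = steps.size() > 1 ? steps.get(steps.size() - 2) : 0;
        return last + previous;
    }

    public static void main(String[] args) {
        long payload = 687712714L; // file size quoted in the issue description
        List<Long> steps = capacities(32, payload);
        System.out.println("resizes: " + (steps.size() - 1));
        System.out.println("final capacity: " + steps.get(steps.size() - 1));
        System.out.println("peak during last resize: "
                + peakBytesDuringLastResize(32, payload));
    }
}
```

Under this model, the 687712714 byte file from the issue description ends up in 
a 1073741824 byte array, with the previous 536870912 byte array still live 
during the final copy. That is roughly 1.5GB in total, which would be 
consistent with the reported failure under a 1300MB heap.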

This works fine for small amounts of data, but it is a major problem for 
medium or large amounts of data. As the buffer size approaches 50% of the free 
heap space, there is no longer enough free memory to allocate the new buffer. 
For example, if a 512MB buffer requires even one more byte of capacity, and 
there is one byte less than 1024MB free, then an attempt by 
ByteArrayOutputStream to allocate a new 1024MB buffer will fail with an OOME. 
This is a tragic waste of memory, especially since the content-length is 
already known and buffering is not even required (which is always the case for 
Ivy).
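
For reference, when the content length is known, HttpURLConnection itself can 
be told not to buffer by enabling fixed-length streaming mode. This is not the 
fix proposed in this comment, which goes through HttpClient. It is a minimal 
sketch, assuming Java 7 or later for the long overload of 
setFixedLengthStreamingMode. The class name StreamingPutSketch and the method 
put are hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class StreamingPutSketch {

    /** PUTs src to dest without buffering the body; returns the HTTP status code. */
    static int put(URL dest, File src) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) dest.openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        // The length is known up front, so Content-Length can be sent
        // immediately and the body written straight to the connection.
        conn.setFixedLengthStreamingMode(src.length());
        try (InputStream in = new FileInputStream(src);
             OutputStream out = conn.getOutputStream()) {
            byte[] buffer = new byte[64 * 1024];
            int length;
            while ((length = in.read(buffer)) != -1) {
                out.write(buffer, 0, length);
            }
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

One trade-off: with output streaming enabled, HttpURLConnection cannot replay 
the body, so authentication challenges and redirects are not handled 
automatically and surface as an HttpRetryException instead.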

Partial/Failed Solution Using HttpClient: 
For one reason or another, possibly to fix this issue, 
org.apache.ivy.util.url.URLHandlerRegistry makes a reflective call to 
org.apache.commons.httpclient.HttpClient. Comments on IVY-1197 instruct users 
to place the HttpClient jar on the classpath (lib dir of Ant) to solve the OOME 
issue. Indeed, HttpClient (even the ancient version from 2005 used by Ivy) does 
not buffer the content and thus could avoid the OOME entirely. However, there 
are three major problems with this solution.

The first problem is that the trial invocation of HttpClient issued to detect 
availability on the classpath fails rather silently: it logs a vague message, 
and only if Ant is invoked in verbose mode. So somebody might drop the jar into 
the lib, see nothing happen, and wonder why.

The second problem is that the trial invocation of HttpClient does indeed fail 
if all one does is drop the httpclient jar into the lib dir. This is because 
HttpClient requires two additional jars in order to instantiate: 
commons-logging and commons-codec. 
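
A quick way to see which of the three jars is missing is to probe for one 
representative class from each. This is a rough diagnostic sketch, not what Ivy 
does internally: the class ClasspathProbeSketch and the method missingClasses 
are hypothetical, and it checks the classpath of the JVM it runs in, which is 
only meaningful if that matches the classpath Ant gives to Ivy.

```java
import java.util.ArrayList;
import java.util.List;

public class ClasspathProbeSketch {

    /** Returns the names from classNames that cannot be loaded. */
    static List<String> missingClasses(String... classNames) {
        List<String> missing = new ArrayList<>();
        ClassLoader loader = ClasspathProbeSketch.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=false: only check presence, run no static initializers
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // One representative class per jar named in this comment.
        List<String> missing = missingClasses(
                "org.apache.commons.httpclient.HttpClient",
                "org.apache.commons.logging.LogFactory",
                "org.apache.commons.codec.binary.Base64");
        if (missing.isEmpty()) {
            System.out.println("HttpClient and its dependencies are on the classpath");
        } else {
            System.err.println("Cannot use HttpClient, missing classes: " + missing);
        }
    }
}
```

Unlike the verbose-only log message, this prints the missing class names 
unconditionally, so an incomplete set of jars is visible immediately.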

Adding these jars successfully triggers the Ivy code which substitutes the 
HttpClient-based org.apache.ivy.util.url.HttpClientHandler for the problematic 
HttpURLConnection-based org.apache.ivy.util.url.BasicURLHandler. But there is 
one more problem.

The third problem is as follows: the Apache docs in the HttpClient performance 
guide describe how to stream a request using a custom RequestEntity object. 
This object is capable of restarting the stream in the event of an interruption 
or an authentication request. The guide provides sample code. This code was 
copied into the Ivy HttpClientHandler, but the block that writes directly to 
the OutputStream (without buffering) has been replaced by a call to the same 
method that BasicURLHandler calls, which buffers all of the data. The net 
effect is that the OOME persists because nothing about the upload has changed: 
the data is still buffered in a ByteArrayOutputStream.

Complete/Successful Solution:
In HttpClientHandler.FileRequestEntity.writeRequest(OutputStream) replace this 
line:

    FileUtil.copy(instream, out, null, false);

with the original Apache sample code (slightly refactored to match the class):

    int length;
    byte[] buffer = new byte[64*1024];
    while ((length = instream.read(buffer)) != -1) {
      out.write(buffer, 0, length); 
    }
        
Then recompile Ivy. I recompiled using JDK 1.6 because that is the oldest VM 
around here.

This fix only works when all of the following libraries are located in the 
$ANT_HOME/lib directory:

    commons-httpclient.jar
    commons-logging.jar
    commons-codec.jar

Those libraries are available in the apache-ivy-2.3.0-bin-with-deps 
distribution.


> OutOfMemoryError during ivy:publish
> -----------------------------------
>
>                 Key: IVY-1197
>                 URL: https://issues.apache.org/jira/browse/IVY-1197
>             Project: Ivy
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 2.0
>            Reporter: Michael Rumpf
>         Attachments: ASF.LICENSE.NOT.GRANTED--clipboard.txt
>
>
> When publishing a large file, an OutOfMemoryError occurs.
> {code}
> [ivy:publish]         published ppg to XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> BUILD FAILED
> /export/build/hudson/jobs/ppg-rcp/workspace/ppg-rcp/com.daimler.ppg.rcp.builder/build-wrapper.xml:152: The following error occurred while executing this line:
> /export/build/hudson/jobs/ppg-rcp/workspace/ppg-rcp/com.daimler.ppg.rcp.builder/build-wrapper.xml:277: java.lang.OutOfMemoryError: Java heap space
>       at java.util.Arrays.copyOf(Arrays.java:2786)
>       at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
>       at sun.net.www.http.PosterOutputStream.write(PosterOutputStream.java:61)
>       at org.apache.ivy.util.FileUtil.copy(FileUtil.java:168)
>       at org.apache.ivy.util.url.BasicURLHandler.upload(BasicURLHandler.java:200)
>       at org.apache.ivy.util.url.URLHandlerDispatcher.upload(URLHandlerDispatcher.java:82)
>       at org.apache.ivy.util.FileUtil.copy(FileUtil.java:140)
>       at org.apache.ivy.plugins.repository.url.URLRepository.put(URLRepository.java:85)
>       at org.apache.ivy.plugins.repository.AbstractRepository.put(AbstractRepository.java:130)
>       at org.apache.ivy.plugins.resolver.RepositoryResolver.put(RepositoryResolver.java:219)
>       at org.apache.ivy.plugins.resolver.RepositoryResolver.publish(RepositoryResolver.java:209)
>       at org.apache.ivy.core.publish.PublishEngine.publish(PublishEngine.java:282)
>       at org.apache.ivy.core.publish.PublishEngine.publish(PublishEngine.java:261)
>       at org.apache.ivy.core.publish.PublishEngine.publish(PublishEngine.java:170)
>       at org.apache.ivy.Ivy.publish(Ivy.java:600)
>       at org.apache.ivy.ant.IvyPublish.doExecute(IvyPublish.java:299)
>       at org.apache.ivy.ant.IvyTask.execute(IvyTask.java:277)
>       at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291)
>       at sun.reflect.GeneratedMethodAccessor101.invoke(Unknown Source)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
>       at org.apache.tools.ant.Task.perform(Task.java:348)
>       at org.apache.tools.ant.Target.execute(Target.java:390)
>       at org.apache.tools.ant.Target.performTasks(Target.java:411)
>       at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1397)
>       at org.apache.tools.ant.helper.SingleCheckExecutor.executeTargets(SingleCheckExecutor.java:38)
>       at org.apache.tools.ant.Project.executeTargets(Project.java:1249)
>       at org.apache.tools.ant.taskdefs.Ant.execute(Ant.java:442)
>       at org.apache.tools.ant.taskdefs.CallTarget.execute(CallTarget.java:105)
>       at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> Total time: 14 minutes 24 seconds
> Finished: FAILURE
> {code}
> The size of the file that is being uploaded is 687712714 bytes, so around 
> 650-700MB.
> The publish task is part of a Hudson Ant build where the artefacts are 
> published to an Artifactory repository at the end.
> I have given the Job 1300MB for the max heap size.
> It seems as if the whole file is loaded into memory for the upload.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
