[ https://issues.apache.org/jira/browse/HADOOP-16854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127506#comment-17127506 ]
Karthik Amarnath edited comment on HADOOP-16854 at 7/27/20, 4:26 PM:
---------------------------------------------------------------------
I do see this issue, with AbfsOutputStream running out of memory while trying to DistCp data to ADLS Gen2 in the Azure EU region. {code:java}
2020-06-07 04:39:58,878 ERROR [main] org.apache.gobblin.runtime.fork.Fork-0: Fork 0 of task task_FileDistcpAzurePush_1591504534904_2 failed to process data records. Set throwable in holder org.apache.gobblin.runtime.ForkThrowableHolder@1ec36c52
java.io.IOException: com.github.rholder.retry.RetryException: Retrying failed to complete successfully after 1 attempts.
    at org.apache.gobblin.writer.RetryWriter.callWithRetry(RetryWriter.java:144)
    at org.apache.gobblin.writer.RetryWriter.writeEnvelope(RetryWriter.java:124)
    at org.apache.gobblin.runtime.fork.Fork.processRecord(Fork.java:513)
    at org.apache.gobblin.runtime.fork.AsynchronousFork.processRecord(AsynchronousFork.java:103)
    at org.apache.gobblin.runtime.fork.AsynchronousFork.processRecords(AsynchronousFork.java:86)
    at org.apache.gobblin.runtime.fork.Fork.run(Fork.java:251)
    at org.apache.gobblin.util.executors.MDCPropagatingRunnable.run(MDCPropagatingRunnable.java:39)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.github.rholder.retry.RetryException: Retrying failed to complete successfully after 1 attempts.
    at com.github.rholder.retry.Retryer.call(Retryer.java:174)
    at com.github.rholder.retry.Retryer$RetryerCallable.call(Retryer.java:318)
    at org.apache.gobblin.writer.RetryWriter.callWithRetry(RetryWriter.java:142)
    ... 11 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
    at org.apache.hadoop.io.ElasticByteBufferPool.getBuffer(ElasticByteBufferPool.java:96)
    at org.apache.hadoop.fs.azurebfs.services.AbfsOutputStream.writeCurrentBufferToService(AbfsOutputStream.java:285)
    at org.apache.hadoop.fs.azurebfs.services.AbfsOutputStream.flushInternal(AbfsOutputStream.java:268)
    at org.apache.hadoop.fs.azurebfs.services.AbfsOutputStream.close(AbfsOutputStream.java:247)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
    at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
    at org.apache.gobblin.data.management.copy.writer.FileAwareInputStreamDataWriter.writeImpl(FileAwareInputStreamDataWriter.java:283)
    at org.apache.gobblin.data.management.copy.writer.FileAwareInputStreamDataWriter.writeImpl(FileAwareInputStreamDataWriter.java:186)
    at org.apache.gobblin.data.management.copy.writer.FileAwareInputStreamDataWriter.writeImpl(FileAwareInputStreamDataWriter.java:83)
    at org.apache.gobblin.instrumented.writer.InstrumentedDataWriterBase.write(InstrumentedDataWriterBase.java:158)
    at org.apache.gobblin.instrumented.writer.InstrumentedDataWriter.write(InstrumentedDataWriter.java:38)
    at org.apache.gobblin.writer.DataWriter.writeEnvelope(DataWriter.java:106)
    at org.apache.gobblin.writer.CloseOnFlushWriterWrapper.writeEnvelope(CloseOnFlushWriterWrapper.java:97)
    at org.apache.gobblin.instrumented.writer.InstrumentedDataWriterDecorator.writeEnvelope(InstrumentedDataWriterDecorator.java:76)
    at org.apache.gobblin.writer.PartitionedDataWriter.writeEnvelope(PartitionedDataWriter.java:176)
    at org.apache.gobblin.writer.RetryWriter$2.call(RetryWriter.java:119)
    at org.apache.gobblin.writer.RetryWriter$2.call(RetryWriter.java:116)
    at com.github.rholder.retry.AttemptTimeLimiters$NoAttemptTimeLimit.call(AttemptTimeLimiters.java:78)
    at com.github.rholder.retry.Retryer.call(Retryer.java:160)
    at com.github.rholder.retry.Retryer$RetryerCallable.call(Retryer.java:318)
    at org.apache.gobblin.writer.RetryWriter.callWithRetry(RetryWriter.java:142)
    at org.apache.gobblin.writer.RetryWriter.writeEnvelope(RetryWriter.java:124)
    at org.apache.gobblin.runtime.fork.Fork.processRecord(Fork.java:513)
    at org.apache.gobblin.runtime.fork.AsynchronousFork.processRecord(AsynchronousFork.java:103)
    at org.apache.gobblin.runtime.fork.AsynchronousFork.processRecords(AsynchronousFork.java:86)
    at org.apache.gobblin.runtime.fork.Fork.run(Fork.java:251)
    at org.apache.gobblin.util.executors.MDCPropagatingRunnable.run(MDCPropagatingRunnable.java:39)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
{code}

was (Author: amarnathkarthik):
Do see the issue with AbfsOutputStream running out of memory while trying to DistCP data to ADSL Gen1 in the AzureEU region. {code:java}
2020-06-07 04:39:58,878 ERROR [main] org.apache.gobblin.runtime.fork.Fork-0: Fork 0 of task task_FileDistcpAzurePush_1591504534904_2 failed to process data records. Set throwable in holder org.apache.gobblin.runtime.ForkThrowableHolder@1ec36c52
java.io.IOException: com.github.rholder.retry.RetryException: Retrying failed to complete successfully after 1 attempts.
{code}

> ABFS: Fix for OutofMemoryException from AbfsOutputStream
> --------------------------------------------------------
>
>                 Key: HADOOP-16854
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16854
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.4.0
>            Reporter: Sneha Vijayarajan
>            Assignee: Bilahari T H
>            Priority: Major
>              Labels: abfsactive
>             Fix For: 3.4.0
>
>         Attachments: AbfsOutputStream Improvements.pdf
>
> In memory-restricted environments, the current max concurrent request count logic triggers allocation of a large number of buffers, blocking execution and leading to OutOfMemory exceptions.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
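The stack trace shows the allocation happening in ElasticByteBufferPool.getBuffer(), which allocates a brand-new heap buffer whenever none is free, so enough in-flight uploads can exhaust the heap. A minimal sketch of the opposite policy, a fixed-capacity pool that makes writers wait instead of allocating without bound (BoundedBufferPool and its parameters are hypothetical, for illustration only; this is not the ABFS code or the actual HADOOP-16854 patch):

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration of bounding buffer memory: total heap used for
// buffers is capped at maxBuffers * bufferSize, because a writer that finds
// the pool empty blocks until another writer returns a buffer.
public class BoundedBufferPool {
    private final BlockingQueue<ByteBuffer> free;

    public BoundedBufferPool(int maxBuffers, int bufferSize) {
        free = new ArrayBlockingQueue<>(maxBuffers);
        for (int i = 0; i < maxBuffers; i++) {
            // All buffer memory is pre-allocated up front.
            free.add(ByteBuffer.allocate(bufferSize));
        }
    }

    /** Blocks until a buffer is available instead of allocating a new one. */
    public ByteBuffer acquire() {
        try {
            ByteBuffer b = free.take();
            b.clear();
            return b;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a buffer", e);
        }
    }

    /** Returns a buffer to the pool, unblocking one waiting writer. */
    public void release(ByteBuffer b) {
        free.offer(b);
    }

    public static void main(String[] args) {
        BoundedBufferPool pool = new BoundedBufferPool(2, 1024);
        ByteBuffer first = pool.acquire();
        pool.acquire(); // takes the second (and last) pre-allocated buffer
        pool.release(first); // a third acquire() would block until this happens
        System.out.println("recycled: " + (pool.acquire() == first)); // prints "recycled: true"
    }
}
```

The actual fix in the ABFS driver may take a different route (see the attached "AbfsOutputStream Improvements.pdf"), but the underlying idea is the same: cap the number of outstanding write buffers rather than letting each in-flight upload allocate a fresh one.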