[ 
https://issues.apache.org/jira/browse/HDDS-1082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16765939#comment-16765939
 ] 

Supratim Deka commented on HDDS-1082:
-------------------------------------

KeyInputStream holds on to one chunk's worth of ByteString buffers (on the heap) 
for every BlockInputStream.

For each 256MB block that is read, 16MB of heap buffers is retained until the 
entire length of the KeyInputStream has been served. 
After reading about 50GB of data from a key, heap memory usage crosses 3GB.
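
A quick check of the numbers above, as a minimal sketch. The class and method names are made up for illustration and are not part of Ozone; the only inputs are the 256MB block size, the 16MB retained per block, and the 50GB read.

```java
public class RetainedHeapEstimate {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    // Heap retained if every block touched so far keeps one chunk's
    // worth of buffers alive (hypothetical helper, not an Ozone API).
    static long retainedBytes(long bytesRead, long blockSize, long chunkSize) {
        long blocksTouched = (bytesRead + blockSize - 1) / blockSize; // ceiling division
        return blocksTouched * chunkSize;
    }

    public static void main(String[] args) {
        long retained = retainedBytes(50 * GB, 256 * MB, 16 * MB);
        // 50GB / 256MB = 200 blocks, 200 * 16MB = 3200MB
        System.out.println(retained / MB + " MB retained"); // prints "3200 MB retained"
    }
}
```

200 blocks at 16MB each is 3200MB, which matches the observation that heap usage crosses 3GB at around 50GB read.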

A possible solution is to release the buffers once the read has crossed the 
block boundary. We also need to ensure that a subsequent seek continues to work 
correctly. Working out the specifics.
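
To make the idea concrete, here is a rough sketch of release-on-boundary with a seek that still works. It is not the actual patch: BlockStream and KeyStream are hypothetical stand-ins for BlockInputStream and KeyInputStream, blocks are tiny in-memory arrays, and the "fetch" is simulated by copying the array. The only point is that a released block re-fetches its buffer lazily when a later seek lands in it.

```java
import java.util.ArrayList;
import java.util.List;

public class ReleaseOnBoundarySketch {

    /** Hypothetical per-block stream; NOT the real BlockInputStream. */
    static class BlockStream {
        private final byte[] source; // stands in for data stored in a container
        private byte[] buffer;       // heap buffer, fetched lazily
        private int pos;

        BlockStream(byte[] source) { this.source = source; }

        int length() { return source.length; }
        boolean hasBuffer() { return buffer != null; }
        void seek(int p) { pos = p; }

        int read() {
            if (pos >= source.length) return -1;          // end of this block
            if (buffer == null) buffer = source.clone();  // simulated (re)fetch
            return buffer[pos++] & 0xff;
        }

        // Drop the heap buffer so it can be garbage collected.
        void release() { buffer = null; pos = 0; }
    }

    /** Hypothetical key-level stream; NOT the real KeyInputStream. */
    static class KeyStream {
        private final List<BlockStream> blocks = new ArrayList<>();
        private int current;

        KeyStream(byte[][] blockData) {
            for (byte[] b : blockData) blocks.add(new BlockStream(b));
        }

        int read() {
            while (current < blocks.size()) {
                int b = blocks.get(current).read();
                if (b >= 0) return b;
                blocks.get(current).release(); // crossed the block boundary
                current++;
            }
            return -1; // end of key
        }

        void seek(long target) {
            if (target < 0) throw new IllegalArgumentException("negative offset");
            long offset = target;
            for (int i = 0; i < blocks.size(); i++) {
                BlockStream b = blocks.get(i);
                if (offset < b.length()) {
                    // Leaving the current block: release its buffer too.
                    if (i != current && current < blocks.size()) {
                        blocks.get(current).release();
                    }
                    current = i;
                    b.seek((int) offset); // buffer is re-fetched on next read
                    return;
                }
                offset -= b.length();
            }
            throw new IllegalArgumentException("seek past end of key: " + target);
        }

        int buffersHeld() {
            int n = 0;
            for (BlockStream b : blocks) if (b.hasBuffer()) n++;
            return n;
        }
    }

    public static void main(String[] args) {
        KeyStream ks = new KeyStream(new byte[][] {{1, 2}, {3, 4}, {5, 6}});
        ks.read(); ks.read(); ks.read();      // third read crosses into block 1
        System.out.println(ks.buffersHeld()); // only the current block is buffered
        ks.seek(0);                           // seek back into a released block
        System.out.println(ks.read());        // data is still readable after re-fetch
    }
}
```

The trade-off in this sketch is that seeking backwards costs a re-fetch of the target block's data, in exchange for holding at most one block's buffers at a time.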



> OutOfMemoryError while reading key of size 100GB
> ------------------------------------------------
>
>                 Key: HDDS-1082
>                 URL: https://issues.apache.org/jira/browse/HDDS-1082
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Nilotpal Nandi
>            Assignee: Supratim Deka
>            Priority: Blocker
>             Fix For: 0.4.0
>
>
> Steps taken:
> --------------------
>  # Put a key of size 100GB.
>  # Tried to read back the key.
> Error thrown:
> ------------------------------
> {noformat}
> java.lang.OutOfMemoryError: Java heap space
> Dumping heap to /tmp/heapdump.bin ...
> Heap dump file created [3883178021 bytes in 10.667 secs]
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>  at org.apache.ratis.thirdparty.com.google.protobuf.ByteString.toByteArray(ByteString.java:643)
>  at org.apache.hadoop.ozone.common.Checksum.verifyChecksum(Checksum.java:217)
>  at org.apache.hadoop.hdds.scm.storage.BlockInputStream.readChunkFromContainer(BlockInputStream.java:227)
>  at org.apache.hadoop.hdds.scm.storage.BlockInputStream.prepareRead(BlockInputStream.java:188)
>  at org.apache.hadoop.hdds.scm.storage.BlockInputStream.read(BlockInputStream.java:130)
>  at org.apache.hadoop.ozone.client.io.KeyInputStream$ChunkInputStreamEntry.read(KeyInputStream.java:232)
>  at org.apache.hadoop.ozone.client.io.KeyInputStream.read(KeyInputStream.java:126)
>  at org.apache.hadoop.ozone.client.io.OzoneInputStream.read(OzoneInputStream.java:49)
>  at java.io.InputStream.read(InputStream.java:101)
>  at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:100)
>  at org.apache.hadoop.ozone.web.ozShell.keys.GetKeyHandler.call(GetKeyHandler.java:98)
>  at org.apache.hadoop.ozone.web.ozShell.keys.GetKeyHandler.call(GetKeyHandler.java:48)
>  at picocli.CommandLine.execute(CommandLine.java:919)
>  at picocli.CommandLine.access$700(CommandLine.java:104)
>  at picocli.CommandLine$RunLast.handle(CommandLine.java:1083)
>  at picocli.CommandLine$RunLast.handle(CommandLine.java:1051)
>  at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:959)
>  at picocli.CommandLine.parseWithHandlers(CommandLine.java:1242)
>  at picocli.CommandLine.parseWithHandler(CommandLine.java:1181)
>  at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:61)
>  at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:52)
>  at org.apache.hadoop.ozone.web.ozShell.Shell.main(Shell.java:83)
> {noformat}



