Dear all, we frequently run into Java heap space errors when running a topology that uses KafkaSpout. Please kindly help.
Our scenario: we send large files to Kafka, each about 3 MB in size. We use Storm to consume the messages from Kafka (using KafkaSpout), process each message line by line, and emit messages. Very frequently we hit memory errors such as the following:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2271)
    at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
    at java.io.ByteArrayOutputStream.ensur

java.lang.OutOfMemoryError: Java heap space
    at com.esotericsoftware.kryo.io.Input.readBytes(Input.java:296)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ByteArraySerializer.read(

java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Arrays.copyOf(Arrays.java:2367)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
    at java.lang.Ab

Our Storm settings are:

drpc.childopts -Xmx768m
supervisor.childopts -Xmx256m
worker.childopts -Xmx1536m -Xms1024m -XX:MaxPermSize=128m -XX:NewSize=512m -XX:MaxNewSize=1024m

Each node of our cluster has 4 CPUs and 8 GB of memory, and we configured 4 workers per node.

We took a heap dump and found a LinkedList object holding a large amount of memory; it is used by KafkaSpout. The list holds the fetched messages. If we understand correctly, KafkaSpout fetches all the messages from the currently consumed offset up to the max offset and stores them in this list. Because the Kafka producer is very fast, and our line-by-line processing in Storm does not consume fast enough, the list grows bigger and bigger.

So my questions are:

1) If our analysis is correct, how can we limit the amount of data that KafkaSpout fetches each time? For example, instead of fetching everything from the current offset to the latest message, can it fetch a fixed number of messages?

2) If our analysis is not correct, could you suggest where the problem might be? Also, are the memory settings correct?
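As a rough check on the memory settings quoted above, the maximum heap sizes can be added up per node. This is only a sketch: it counts the -Xmx values alone (not PermGen, thread stacks, or the OS), and whether a DRPC server runs on every node is an assumption. The class and method names are hypothetical, made up for this illustration.

```java
public class HeapBudget {
    // Values copied from the settings quoted in this post, in MB.
    static final int WORKER_XMX_MB = 1536;      // worker.childopts -Xmx1536m
    static final int SUPERVISOR_XMX_MB = 256;   // supervisor.childopts -Xmx256m
    static final int DRPC_XMX_MB = 768;         // drpc.childopts -Xmx768m
    static final int WORKERS_PER_NODE = 4;
    static final int NODE_MEMORY_MB = 8 * 1024; // 8 GB per node

    // Sum of maximum heap sizes on one node.
    // runsDrpc is an assumption: the post does not say which nodes run DRPC.
    static int maxHeapPerNodeMb(boolean runsDrpc) {
        int total = WORKERS_PER_NODE * WORKER_XMX_MB + SUPERVISOR_XMX_MB;
        if (runsDrpc) {
            total += DRPC_XMX_MB;
        }
        return total;
    }

    public static void main(String[] args) {
        for (boolean runsDrpc : new boolean[] {false, true}) {
            int heap = maxHeapPerNodeMb(runsDrpc);
            System.out.println("DRPC on node: " + runsDrpc
                    + ", max heap total: " + heap + " MB"
                    + ", left for everything else: " + (NODE_MEMORY_MB - heap) + " MB");
        }
    }
}
```

With these numbers the workers and supervisor alone can claim up to 6400 MB of the 8192 MB on a node, or 7168 MB if a DRPC server also runs there, so there is little room to simply raise the worker heap further.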
Thank you very much for your help! Regards, Sai
