Did you recently add topics / partitions? Each partition takes a memory buffer for replication, so you can hit an OOME after adding partitions without resizing memory.
You basically need the Java heap size to be larger than (# partitions on the broker) x replica.fetch.max.bytes.

Gwen

On Wed, Dec 14, 2016 at 12:03 PM, Zakee <kzak...@netzero.net> wrote:
> Recently, we have seen our brokers crash with the errors below. Any idea
> what might be wrong here? The brokers have been running for a long time
> on the same hosts/configs without this issue. Is this something to do
> with the new version 0.10.0.1 (which we upgraded to recently), or could
> it be a h/w issue? 10 hosts are dedicated, one broker per host. Each
> host has 128 GB RAM and 20 TB of storage mounts. Any pointers will
> help...
>
> [2016-12-12 02:49:58,134] FATAL [app=broker] [ReplicaFetcherThread-15-15]
> [ReplicaFetcherThread-15-15], Disk error while replicating data for
> mytopic-19 (kafka.server.ReplicaFetcherThread)
> kafka.common.KafkaStorageException: I/O exception in append to log 'mytopic-19'
>     at kafka.log.Log.append(Log.scala:349)
>     at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:130)
>     at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:42)
>     at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:159)
>     at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:141)
>     at scala.Option.foreach(Option.scala:257)
>     at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:141)
>     at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:138)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>     at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply$mcV$sp(AbstractFetcherThread.scala:138)
>     at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:138)
>     at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:138)
>     at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
>     at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:136)
>     at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> Caused by: java.io.IOException: Map failed
>     at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:907)
>     at kafka.log.AbstractIndex$$anonfun$resize$1.apply(AbstractIndex.scala:116)
>     at kafka.log.AbstractIndex$$anonfun$resize$1.apply(AbstractIndex.scala:106)
>     at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
>     at kafka.log.AbstractIndex.resize(AbstractIndex.scala:106)
>     at kafka.log.AbstractIndex$$anonfun$trimToValidSize$1.apply$mcV$sp(AbstractIndex.scala:160)
>     at kafka.log.AbstractIndex$$anonfun$trimToValidSize$1.apply(AbstractIndex.scala:160)
>     at kafka.log.AbstractIndex$$anonfun$trimToValidSize$1.apply(AbstractIndex.scala:160)
>     at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
>     at kafka.log.AbstractIndex.trimToValidSize(AbstractIndex.scala:159)
>     at kafka.log.Log.roll(Log.scala:772)
>     at kafka.log.Log.maybeRoll(Log.scala:742)
>     at kafka.log.Log.append(Log.scala:405)
>     ... 16 more
> Caused by: java.lang.OutOfMemoryError: Map failed
>     at sun.nio.ch.FileChannelImpl.map0(Native Method)
>     at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:904)
>     ... 28 more
>
> Thanks
> -Zakee

--
*Gwen Shapira*
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter <https://twitter.com/ConfluentInc> | blog <http://www.confluent.io/blog>
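[Editor's note] The sizing rule in the reply can be turned into a quick back-of-the-envelope check. The sketch below assumes a hypothetical partition count of 4000 (not a number from this thread) and the Kafka default of 1048576 bytes (1 MiB) for `replica.fetch.max.bytes`:

```python
# Back-of-the-envelope check of the heap sizing rule:
#   Java heap > (# partitions replicated on the broker) * replica.fetch.max.bytes
partitions_on_broker = 4000        # hypothetical count, for illustration only
replica_fetch_max_bytes = 1048576  # Kafka default: 1 MiB per fetch buffer

required_bytes = partitions_on_broker * replica_fetch_max_bytes
print(f"replication buffers alone need ~{required_bytes / 2**30:.1f} GiB of heap")
# → replication buffers alone need ~3.9 GiB of heap
```

With these assumed numbers, a broker hosting 4000 partition replicas would need roughly 4 GiB of heap just for replica fetch buffers, before any other broker memory use.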