I have a streaming app (PySpark 1.5.2 on YARN) that's crashing due to a driver OOM (in the JVM
part, not Python); no matter how large a heap I assign, it eventually runs out.
When I inspect the heap, it is dominated by byte[] items referenced from
io.netty.buffer.PoolThreadCache. The number of
io.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry instances stays constant, yet the
number of [B (byte arrays) keeps growing, as does the number of Finalizer
instances. Checking the Finalizer instances, they are all for
ZipFile$ZipFileInputStream and ZipFile$ZipFileInflaterInputStream:
 num     #instances         #bytes  class name
----------------------------------------------
   1:        123556      278723776  [B
   2:        258988       10359520  java.lang.ref.Finalizer
   3:        174620        9778720  java.util.zip.Deflater
   4:         66684        7468608  org.apache.spark.executor.TaskMetrics
   5:         80070        7160112  [C
   6:        282624        6782976  io.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry
   7:        206371        4952904  java.lang.Long
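For context on the Deflater and Finalizer counts above: java.util.zip.Deflater and Inflater wrap native zlib state that is released only by an explicit end() call or, lazily, by their finalizers, and the ZipFile streams rely on finalization when they are not closed promptly. A minimal sketch of that mechanic (class and variable names are mine, not from the app):

```java
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo: Deflater/Inflater hold native zlib buffers that are
// freed only by end() or, lazily, by their finalizers. If end()/close() is
// never called, each instance leaves work for the Finalizer thread, which
// is one way Deflater and Finalizer counts can pile up in a histogram.
public class NativeZlibDemo {
    public static void main(String[] args) throws Exception {
        byte[] input = "hello hello hello".getBytes("UTF-8");

        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] compressed = new byte[256];
        int clen = deflater.deflate(compressed);
        deflater.end(); // release native memory now, not at finalization

        Inflater inflater = new Inflater();
        inflater.setInput(compressed, 0, clen);
        byte[] restored = new byte[256];
        int rlen = inflater.inflate(restored);
        inflater.end(); // same: explicit release

        System.out.println(new String(restored, 0, rlen, "UTF-8").equals("hello hello hello"));
    }
}
```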
The platform uses Netty 3.6.6 and OpenJDK 1.8 (I tried 1.7 as well, with the
same issue).
Would anyone have a clue how to troubleshoot this further?
Thanks.