I searched the code briefly.

The following files use ZipEntry / ZipOutputStream:

core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
core/src/main/scala/org/apache/spark/deploy/RPackageUtils.scala

FYI

On Tue, Dec 22, 2015 at 9:16 AM, Antony Mayi <antonym...@yahoo.com.invalid>
wrote:

> I narrowed it down to the problem described, for example, here:
> https://bugs.openjdk.java.net/browse/JDK-6293787
>
> It is the mass finalization of the zip Inflater/Deflater objects: the
> finalizer thread can't keep up with the rate at which these instances are
> garbage collected. As the JDK bug report (closed as not a bug) suggests,
> the underlying error is suboptimal destruction of the instances, i.e.
> relying on finalizers instead of releasing them explicitly.
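As a sketch of the pattern that bug report implies, here is what explicit destruction looks like: call end() on the Deflater yourself in a finally block so the native zlib memory is freed immediately, instead of waiting for the finalizer thread. (The class and method names below are made up for illustration; only the java.util.zip.Deflater API is real.)

```java
import java.util.zip.Deflater;

public class DeflaterCleanup {
    // Compress a byte[] and explicitly release the native zlib resources,
    // rather than leaving cleanup to finalization (the backlog described
    // in JDK-6293787).
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            // Buffer sized generously; tiny inputs can grow when deflated.
            byte[] buf = new byte[input.length + 64];
            int n = deflater.deflate(buf);
            byte[] out = new byte[n];
            System.arraycopy(buf, 0, out, 0, n);
            return out;
        } finally {
            deflater.end(); // frees native memory now, not at finalization
        }
    }
}
```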
>
> Not sure where the zip comes from - for all the compressors used in Spark
> I was using the default snappy codec.
>
> I am trying to disable all the spark.*.compress options, and so far this
> seems to have dramatically improved things: finalization looks to be
> keeping up and the heap is stable.
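For reference, disabling those options could look something like the following on the command line. This is a sketch: the config keys are the spark.*.compress options as of Spark 1.x, and your_streaming_app.py is a placeholder for the actual application.

```shell
# Sketch: turn off the spark.*.compress options mentioned above.
spark-submit \
  --conf spark.broadcast.compress=false \
  --conf spark.rdd.compress=false \
  --conf spark.shuffle.compress=false \
  --conf spark.shuffle.spill.compress=false \
  your_streaming_app.py
```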
>
> Any input is still welcome!
>
>
> On Tuesday, 22 December 2015, 12:17, Ted Yu <yuzhih...@gmail.com> wrote:
>
>
>
> This might be related but the jmap output there looks different:
>
>
> http://stackoverflow.com/questions/32537965/huge-number-of-io-netty-buffer-poolthreadcachememoryregioncacheentry-instances
>
> On Tue, Dec 22, 2015 at 2:59 AM, Antony Mayi <antonym...@yahoo.com.invalid
> > wrote:
>
> I have a streaming app (PySpark 1.5.2 on YARN) that is crashing due to a
> driver OOM (in the JVM part, not Python): no matter how big a heap is
> assigned, it eventually runs out.
>
> When inspecting the heap, it is mostly taken up by the "byte" items of
> io.netty.buffer.PoolThreadCache. The number of
> io.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry instances stays
> constant, yet the number of [B "bytes" keeps growing, as does the number
> of Finalizer instances. Inspecting the Finalizer instances shows they are
> all for ZipFile$ZipFileInputStream and ZipFile$ZipFileInflaterInputStream.
>
>  num     #instances         #bytes  class name
> ----------------------------------------------
>    1:        123556      278723776  [B
>    2:        258988       10359520  java.lang.ref.Finalizer
>    3:        174620        9778720  java.util.zip.Deflater
>    4:         66684        7468608  org.apache.spark.executor.TaskMetrics
>    5:         80070        7160112  [C
>    6:        282624        6782976  io.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry
>    7:        206371        4952904  java.lang.Long
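A histogram like the one above can be reproduced with something along these lines; <pid> is a placeholder for the driver JVM's process id.

```shell
# Sketch: dump a live-object class histogram from the running driver JVM.
jps -lm                           # list local JVMs to find the driver's pid
jmap -histo:live <pid> | head -20 # top classes by instance count and bytes
```

Note that -histo:live forces a full GC first, so objects awaiting finalization may still show up via their Finalizer references.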
>
> The platform is using Netty 3.6.6 and OpenJDK 1.8 (tried 1.7 as well,
> with the same issue).
>
> Would anyone have a clue how to troubleshoot this further?
>
> thx.
>
