Maybe try reducing spark.executor.cores. Perhaps your tasks have a large off-heap overhead, and it would be better to have fewer tasks running in parallel. Is it a streaming job?
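A minimal sketch of that suggestion (values and the app name are illustrative only; the idea is simply that fewer concurrent tasks per executor means fewer per-task off-heap buffers in flight at once):

    import org.apache.spark.{SparkConf, SparkContext}

    // Illustrative values only -- halve the concurrent tasks per executor
    // compared to the spark.executor.cores 24 quoted below.
    val conf = new SparkConf()
      .setAppName("fewer-cores-sketch")      // hypothetical app name
      .set("spark.executor.cores", "12")     // down from 24
      .set("spark.executor.memory", "50g")
    val sc = new SparkContext(conf)          // master is supplied by spark-submit
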
On Tue, Aug 4, 2015 at 2:14 PM Igor Berman <igor.ber...@gmail.com> wrote:

> Sorry, I can't disclose info about my prod cluster.
>
> Nothing jumps to my mind regarding your config.
> We don't use lz4 compression, and I don't know what spark.deploy.spreadOut is (there is no documentation regarding it).
>
> If you are sure that you don't have a memory leak in your business logic, I would try to reset each property to its default (or just remove it from your config) and run your job again to see whether one of them is somehow connected.
>
> My config (nothing special really):
> spark.shuffle.consolidateFiles true
> spark.speculation false
> spark.executor.extraJavaOptions -XX:+UseStringCache -XX:+UseCompressedStrings -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log -verbose:gc
> spark.executor.logs.rolling.maxRetainedFiles 1000
> spark.executor.logs.rolling.strategy time
> spark.worker.cleanup.enabled true
> spark.logConf true
> spark.rdd.compress true
>
>
> On 4 August 2015 at 12:59, Sea <261810...@qq.com> wrote:
>
>> How many machines are there in your standalone cluster?
>> I am not using Tachyon.
>>
>> GC cannot help me... Can anyone help?
>>
>> My configuration:
>>
>> spark.deploy.spreadOut false
>> spark.eventLog.enabled true
>> spark.executor.cores 24
>>
>> spark.ui.retainedJobs 10
>> spark.ui.retainedStages 10
>> spark.history.retainedApplications 5
>> spark.deploy.retainedApplications 10
>> spark.deploy.retainedDrivers 10
>> spark.streaming.ui.retainedBatches 10
>> spark.sql.thriftserver.ui.retainedSessions 10
>> spark.sql.thriftserver.ui.retainedStatements 100
>>
>> spark.file.transferTo false
>> spark.driver.maxResultSize 4g
>> spark.sql.hive.metastore.jars=/spark/spark-1.4.1/hive/*
>>
>> spark.eventLog.dir hdfs://mycluster/user/spark/historylog
>> spark.history.fs.logDirectory hdfs://mycluster/user/spark/historylog
>>
>> spark.driver.extraClassPath=/spark/spark-1.4.1/extlib/*
>> spark.executor.extraClassPath=/spark/spark-1.4.1/extlib/*
>>
>> spark.sql.parquet.binaryAsString true
>> spark.serializer org.apache.spark.serializer.KryoSerializer
>> spark.kryoserializer.buffer 32
>> spark.kryoserializer.buffer.max 256
>> spark.shuffle.consolidateFiles true
>> spark.io.compression.codec org.apache.spark.io.LZ4CompressionCodec
>>
>>
>> ------------------ Original Message ------------------
>> *From:* "Igor Berman" <igor.ber...@gmail.com>;
>> *Sent:* Monday, August 3, 2015, 7:56 PM
>> *To:* "Sea" <261810...@qq.com>;
>> *Cc:* "Barak Gitsis" <bar...@similarweb.com>; "Ted Yu" <yuzhih...@gmail.com>; "user@spark.apache.org" <user@spark.apache.org>; "rxin" <r...@databricks.com>; "joshrosen" <joshro...@databricks.com>; "davies" <dav...@databricks.com>;
>> *Subject:* Re: About memory leak in spark 1.4.1
>>
>> In general, what is your configuration? Use --conf "spark.logConf=true"
>>
>> We have 1.4.1 in a production standalone cluster and haven't experienced what you are describing.
>> Can you verify in the web UI that Spark indeed got your 50g-per-executor limit? I mean on the configuration page..
>>
>> Might you be using off-heap storage (Tachyon)?
>>
>>
>> On 3 August 2015 at 04:58, Sea <261810...@qq.com> wrote:
>>
>>> "spark uses a lot more than heap memory, it is the expected behavior."
>>> It didn't exist in spark 1.3.x.
>>> What does "a lot more than" mean? It means that I lose control of it!
>>> I tried setting 31g, but it still grows to 55g and continues to grow!!! That is the point!
>>> I have tried setting memoryFraction to 0.2, but it didn't help.
>>> I don't know whether it will still exist in the next release, 1.5; I hope not.
>>>
>>>
>>> ------------------ Original Message ------------------
>>> *From:* "Barak Gitsis" <bar...@similarweb.com>;
>>> *Sent:* Sunday, August 2, 2015, 9:55 PM
>>> *To:* "Sea" <261810...@qq.com>; "Ted Yu" <yuzhih...@gmail.com>;
>>> *Cc:* "user@spark.apache.org" <user@spark.apache.org>; "rxin" <r...@databricks.com>; "joshrosen" <joshro...@databricks.com>; "davies" <dav...@databricks.com>;
>>> *Subject:* Re: About memory leak in spark 1.4.1
>>>
>>> Spark uses a lot more than heap memory; it is the expected behavior.
>>> In 1.4, off-heap memory usage is supposed to grow in comparison to 1.3.
>>>
>>> Better to use as little memory as you can for the heap, and since you are not utilizing it already, it is safe for you to reduce it.
>>> memoryFraction helps you optimize heap usage for your data/application profile while keeping it tight.
>>>
>>>
>>> On Sun, Aug 2, 2015 at 12:54 PM Sea <261810...@qq.com> wrote:
>>>
>>>> spark.storage.memoryFraction is in heap memory, but my situation is that the memory used is more than the heap memory!
>>>>
>>>> Anyone else using spark 1.4.1 in production?
>>>>
>>>>
>>>> ------------------ Original Message ------------------
>>>> *From:* "Ted Yu" <yuzhih...@gmail.com>;
>>>> *Sent:* Sunday, August 2, 2015, 5:45 PM
>>>> *To:* "Sea" <261810...@qq.com>;
>>>> *Cc:* "Barak Gitsis" <bar...@similarweb.com>; "user@spark.apache.org" <user@spark.apache.org>; "rxin" <r...@databricks.com>; "joshrosen" <joshro...@databricks.com>; "davies" <dav...@databricks.com>;
>>>> *Subject:* Re: About memory leak in spark 1.4.1
>>>>
>>>> http://spark.apache.org/docs/latest/tuning.html does mention spark.storage.memoryFraction in two places.
>>>> One is under the Cache Size Tuning section.
>>>>
>>>> FYI
>>>>
>>>> On Sun, Aug 2, 2015 at 2:16 AM, Sea <261810...@qq.com> wrote:
>>>>
>>>>> Hi, Barak
>>>>> It is ok with spark 1.3.0; the problem is with spark 1.4.1.
>>>>> I don't think spark.storage.memoryFraction will make any difference, because it is still in heap memory.
>>>>>
>>>>>
>>>>> ------------------ Original Message ------------------
>>>>> *From:* "Barak Gitsis" <bar...@similarweb.com>;
>>>>> *Sent:* Sunday, August 2, 2015, 4:11 PM
>>>>> *To:* "Sea" <261810...@qq.com>; "user" <user@spark.apache.org>;
>>>>> *Cc:* "rxin" <r...@databricks.com>; "joshrosen" <joshro...@databricks.com>; "davies" <dav...@databricks.com>;
>>>>> *Subject:* Re: About memory leak in spark 1.4.1
>>>>>
>>>>> Hi,
>>>>> Reducing spark.storage.memoryFraction did the trick for me. The heap doesn't get filled, because it is reserved..
>>>>> My reasoning is:
>>>>> I give the executor all the memory I can give it, so that makes it a boundary.
>>>>> From there I try to make the best use of memory I can.
>>>>> storage.memoryFraction is, in a sense, user data space, and the rest can be used by the system.
>>>>> If you don't have so much data that you MUST store in memory for performance, better to give Spark more space..
>>>>> I ended up setting it to 0.3.
>>>>>
>>>>> All that said, this is on spark 1.3 on our cluster.
>>>>>
>>>>> Hope that helps.
>>>>>
>>>>> On Sat, Aug 1, 2015 at 5:43 PM Sea <261810...@qq.com> wrote:
>>>>>
>>>>>> Hi, all
>>>>>> I upgraded Spark to 1.4.1, and many applications failed... I find that the heap memory is not full, but the CoarseGrainedExecutorBackend process takes more memory than I expect, and it keeps increasing as time goes on, finally exceeding the max limit of the server, and the worker dies.....
>>>>>>
>>>>>> Can anyone help?
>>>>>>
>>>>>> Mode: standalone
>>>>>>
>>>>>> spark.executor.memory 50g
>>>>>>
>>>>>> 25583 xiaoju 20 0 75.5g 55g 28m S 1729.3 88.1 2172:52 java
>>>>>>
>>>>>> 55g is more than the 50g I asked for.
>>>>>>
>>>>> --
>>>>> *-Barak*
>>>>
>>>>
>>> --
>>> *-Barak*
>>
>>
>
--
*-Barak*
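One rough way to see how much of the gap between the 55g resident size and the 50g heap is NIO direct (off-heap) buffer memory is to read the JVM's buffer-pool MBeans from inside an executor. This is only a sketch: it covers buffers allocated through the JVM (e.g. by the NIO/Netty transport), not native allocations made outside it, and the helper name and the sampling job are hypothetical.

    import java.lang.management.{BufferPoolMXBean, ManagementFactory}
    import scala.collection.JavaConverters._

    // Report heap usage plus the "direct" and "mapped" buffer pools of this JVM.
    def memoryReport(): String = {
      val heap  = ManagementFactory.getMemoryMXBean.getHeapMemoryUsage
      val pools = ManagementFactory.getPlatformMXBeans(classOf[BufferPoolMXBean]).asScala
      val poolLines = pools.map(p => f"${p.getName}%-7s ${p.getMemoryUsed / (1024 * 1024)}%6d MB")
      (f"heap    ${heap.getUsed / (1024 * 1024)}%6d MB" +: poolLines).mkString("\n")
    }

    // Hypothetical usage: sample tasks across executors and print the reports on the driver.
    // sc.parallelize(1 to 100, 100).map(_ => memoryReport()).distinct().collect().foreach(println)
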
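And a minimal sketch of the tuning direction suggested in the replies above, a smaller heap with a reduced spark.storage.memoryFraction (Spark 1.x properties; values and the app name are illustrative, not tested against this workload):

    import org.apache.spark.{SparkConf, SparkContext}

    // Illustrative values only -- keep the executor heap modest so the server
    // retains headroom for off-heap usage, and shrink the storage (cache)
    // share of that heap.
    val conf = new SparkConf()
      .setAppName("memory-tuning-sketch")            // hypothetical app name
      .set("spark.executor.memory", "31g")           // smaller heap than the 50g above
      .set("spark.storage.memoryFraction", "0.3")    // the value Barak reported settling on
      .set("spark.logConf", "true")                  // log the effective config at startup
    val sc = new SparkContext(conf)                  // master is supplied by spark-submit
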