Is this still Mesos fine-grained mode?

On Wed, Oct 21, 2015 at 1:16 PM, Jerry Lam <chiling...@gmail.com> wrote:

> Hi guys,
>
> There is another memory issue. Not sure if this is related to Tungsten
> this time, because I have it disabled (spark.sql.tungsten.enabled=false).
> It happens when there are too many tasks running (300); I need to limit
> the number of tasks to avoid this. The executor has 6G. Spark 1.5.1 is
> being used.
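>
> For reference, a minimal sketch of the setup described above (only the
> two values mentioned in this thread are real; the rest is boilerplate):
>
>   import org.apache.spark.{SparkConf, SparkContext}
>   import org.apache.spark.sql.SQLContext
>
>   val conf = new SparkConf()
>     .set("spark.executor.memory", "6g") // the 6G executors mentioned above
>   val sc = new SparkContext(conf)
>   val sqlContext = new SQLContext(sc)
>   sqlContext.setConf("spark.sql.tungsten.enabled", "false") // Tungsten off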
>
> Best Regards,
>
> Jerry
>
> org.apache.spark.SparkException: Task failed while writing rows.
>       at org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:393)
>       at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
>       at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
>       at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>       at org.apache.spark.scheduler.Task.run(Task.scala:88)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Unable to acquire 67108864 bytes of memory
>       at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.acquireNewPage(UnsafeExternalSorter.java:351)
>       at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.<init>(UnsafeExternalSorter.java:138)
>       at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.create(UnsafeExternalSorter.java:106)
>       at org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:74)
>       at org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:56)
>       at org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:339)
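>
> (Back-of-the-envelope, assuming the 67108864 bytes above is one 64 MiB
> sort page per task: 300 concurrent tasks x 64 MiB ~= 18.75 GiB, well
> beyond the 6G executor, which is consistent with fewer tasks avoiding it.)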
>
>
> On Tue, Oct 20, 2015 at 9:10 PM, Reynold Xin <r...@databricks.com> wrote:
>
>> With Jerry's permission, sending this back to the dev list to close the
>> loop.
>>
>>
>> ---------- Forwarded message ----------
>> From: Jerry Lam <chiling...@gmail.com>
>> Date: Tue, Oct 20, 2015 at 3:54 PM
>> Subject: Re: If you use Spark 1.5 and disabled Tungsten mode ...
>> To: Reynold Xin <r...@databricks.com>
>>
>>
>> Yup, coarse-grained mode works just fine. :)
>> The difference is that by default, coarse-grained mode uses 1 core per
>> task. If I constrain the total to 20 cores, there can only be 20 tasks
>> running at the same time. However, with fine-grained mode, I cannot set
>> the total number of cores, so there could be 200+ tasks running at the
>> same time (it is dynamic). So perhaps the calculation of how much memory
>> to acquire fails when the number of cores cannot be known ahead of time,
>> because you cannot assume that only X tasks are running in an executor?
>> Just my guess...
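>>
>> For reference, a minimal sketch of the coarse-grained setup described
>> above (the 20-core cap is just the example from this message):
>>
>>   val conf = new org.apache.spark.SparkConf()
>>     .set("spark.mesos.coarse", "true") // coarse-grained Mesos mode
>>     .set("spark.cores.max", "20")      // cap total cores for the app
>>   // With the default spark.task.cpus=1, at most 20 tasks run at once.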
>>
>>
>> On Tue, Oct 20, 2015 at 6:24 PM, Reynold Xin <r...@databricks.com> wrote:
>>
>>> Can you try coarse-grained mode and see if it is the same?
>>>
>>>
>>> On Tue, Oct 20, 2015 at 3:20 PM, Jerry Lam <chiling...@gmail.com> wrote:
>>>
>>>> Hi Reynold,
>>>>
>>>> Yes, I'm using 1.5.1. I see them quite often. Sometimes it recovers, but
>>>> sometimes it does not. For one particular job, it failed every time with
>>>> the acquire-memory issue. I'm using Spark on Mesos in fine-grained mode.
>>>> Does that make a difference?
>>>>
>>>> Best Regards,
>>>>
>>>> Jerry
>>>>
>>>> On Tue, Oct 20, 2015 at 5:27 PM, Reynold Xin <r...@databricks.com>
>>>> wrote:
>>>>
>>>>> Jerry - I think that's been fixed in 1.5.1. Do you still see it?
>>>>>
>>>>> On Tue, Oct 20, 2015 at 2:11 PM, Jerry Lam <chiling...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I disabled it because of the "Could not acquire 65536 bytes of
>>>>>> memory" error, which was failing the job. So for now, I'm not touching it.
>>>>>>
>>>>>> On Tue, Oct 20, 2015 at 4:48 PM, charmee <charm...@gmail.com> wrote:
>>>>>>
>>>>>>> We had disabled Tungsten after we found a few performance issues, but
>>>>>>> had to re-enable it because we found that with a large number of
>>>>>>> group-by fields, the shuffle keeps failing when Tungsten is disabled.
>>>>>>>
>>>>>>> Here is an excerpt from one of our engineers with his analysis.
>>>>>>>
>>>>>>> With Tungsten enabled (default in Spark 1.5), ~90 files of 0.5G each:
>>>>>>>
>>>>>>> Ingest (after applying broadcast lookups): 54 min
>>>>>>> Aggregation (~30 fields in group by, another 40 in aggregation): 18 min
>>>>>>>
>>>>>>> With Tungsten disabled:
>>>>>>>
>>>>>>> Ingest: 30 min
>>>>>>> Aggregation: erroring out
>>>>>>>
>>>>>>> On smaller tests we found that joins are slow with Tungsten enabled.
>>>>>>> With GROUP BY, disabling Tungsten does not work in the first place.
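>>>>>>>
>>>>>>> For illustration only (df and all column names here are hypothetical),
>>>>>>> the failing aggregation has roughly this shape:
>>>>>>>
>>>>>>>   import org.apache.spark.sql.functions.{sum, avg}
>>>>>>>   df.groupBy("f1", "f2", "f3" /* ... ~30 group-by fields ... */)
>>>>>>>     .agg(sum("m1"), avg("m2") /* ... ~40 aggregated fields ... */)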
>>>>>>>
>>>>>>> Hope this helps.
>>>>>>>
>>>>>>> -Charmee
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>
