In my opinion, Spark is built for big data, so 400M does not seem big. From the slides I have read about broadcast, my understanding is that the executor will send the broadcast variable back to the driver, and that each executor owns a complete copy of the broadcast variable.
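To make it concrete, here is roughly what I ran: a minimal sketch in the shape of the bundled BroadcastTest example, against the SparkContext API of that era. The master URL, object name, array size and partition count below are only illustrative, not the exact values from my job.

import org.apache.spark.SparkContext

object BroadcastSketch {
  def main(args: Array[String]) {
    // Illustrative master URL; mine points at the cluster master.
    val sc = new SparkContext("spark://master:7077", "Broadcast Test")

    // Build roughly 400M of data on the driver (100M Ints * 4 bytes each).
    val num = 100 * 1000 * 1000
    val arr = new Array[Int](num)
    for (i <- 0 until arr.length) arr(i) = i

    // Ship it once as a broadcast variable; every executor keeps a
    // read-only copy of it.
    val barr = sc.broadcast(arr)

    // Each task only reads barr.value locally on the executor and prints
    // its size; only the tiny per-task result is serialized back to the
    // driver.
    sc.parallelize(1 to 320, 320).foreach { i =>
      println(barr.value.length)
    }

    sc.stop()
  }
}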
In my experiment, I have 20 machines and each machine runs 2 executors, and I used the default parallelism, which is 8, so there are 320 tasks in one stage in total. The workers should then send 320 * (400M/8) = 16G of data back to the driver, which seems very big. But the log shows that after serialization, the data size sent back to the driver is just 446 bytes per task:

*org.apache.spark.storage.BlockManager - Found block broadcast_5 locally*
*org.apache.spark.executor.Executor - Serialized size of result for 1901 is 446*
*org.apache.spark.executor.Executor - Sending result for 1901 directly to driver*

So the total data sent back to the driver is just 320 * 446 bytes = 142720 bytes, which is really small in my opinion.

---------------

In summary:

1. Spark is built for big data, so 400M is not big in my opinion.
2. I am not sure whether my understanding of broadcast is right; if it is, shouldn't the data sent back to the driver be much bigger?
3. I just wonder why the serialization ratio is so high: how can it serialize 400M/8 = 50M down to just 446 bytes?
4. If it is my fault for not running the broadcast experiment in the right way, then I hope the Spark community can give more examples about broadcast; this would benefit many users.


On Mon, Jan 13, 2014 at 12:22 PM, Aureliano Buendia <[email protected]> wrote:

> On Mon, Jan 13, 2014 at 4:17 AM, lihu <[email protected]> wrote:
>
>> I have encountered the same problem as you.
>> I have a cluster of 20 machines, and I just ran the broadcast example; all
>> I did was change the data size in the example to 400M, which is really a
>> small data size.
>>
>
> Is 400 MB a really small size for broadcasting?
>
> I had the impression that broadcast is for objects much, much smaller, about
> less than 10 MB.
>
>
>> But I ran into the same problem as you.
>> *So I wonder whether the broadcast capability is weak in the Spark system?*
>>
>> Here is my config:
>>
>> *SPARK_MEM=12g*
>> *SPARK_MASTER_WEBUI_PORT=12306*
>> *SPARK_WORKER_MEMORY=12g*
>> *SPARK_JAVA_OPTS+="-Dspark.executor.memory=8g -Dspark.akka.timeout=600
>> -Dspark.local.dir=/disk3/lee/tmp -Dspark.worker.timeout=600
>> -Dspark.akka.frameSize=10000 -Dspark.akka.askTimeout=300
>> -Dspark.storage.blockManagerTimeoutIntervalMs=100000
>> -Dspark.akka.retry.wait=600 -Dspark.blockManagerHeartBeatMs=80000 -Xms15G
>> -Xmx15G -XX:+UseConcMarkSweepGC -XX:-UseGCOverheadLimit"*
>>
>> On Sat, Jan 11, 2014 at 8:27 AM, Khanderao kand <[email protected]> wrote:
>>
>>> If your object size > 10MB you may need to change spark.akka.frameSize.
>>>
>>> What is your spark.akka.timeout?
>>>
>>> Did you change spark.akka.heartbeat.interval?
>>>
>>> BTW, based on the large size being broadcast across 25 nodes, you may want
>>> to consider the frequency of such transfers and evaluate alternative
>>> patterns.
>>>
>>> On Tue, Jan 7, 2014 at 12:55 AM, Sebastian Schelter <[email protected]> wrote:
>>>
>>>> Spark repeatedly fails to broadcast a large object on a cluster of 25
>>>> machines for me.
>>>>
>>>> I get log messages like this:
>>>>
>>>> [spark-akka.actor.default-dispatcher-4] WARN
>>>> org.apache.spark.storage.BlockManagerMasterActor - Removing BlockManager
>>>> BlockManagerId(3, cloud-33.dima.tu-berlin.de, 42185, 0) with no recent
>>>> heart beats: 134689ms exceeds 45000ms
>>>>
>>>> Is there something wrong with my config? Do I have to increase some
>>>> timeout?
>>>>
>>>> Thx,
>>>> Sebastian
>>>>
