I am having similar issues with much smaller data sets. I am using the Spark
EC2 scripts to launch clusters, but I almost always end up with straggling
executors that take over a node's CPU and memory and never finish.



On Thu, Mar 20, 2014 at 1:54 PM, Soila Pertet Kavulya <skavu...@gmail.com> wrote:

> Hi Reynold,
>
> Nice! What Spark configuration parameters did you use to get your job to
> run successfully on a large dataset? My job is failing on 1TB of input data
> (uncompressed) on a 4-node cluster (64GB memory per node). There are no
> OutOfMemory errors, just lost executors.
>
> Thanks,
>
> Soila
> On Mar 20, 2014 11:29 AM, "Reynold Xin" <r...@databricks.com> wrote:
>
>> I'm not really at liberty to discuss details of the job. It involves some
>> expensive aggregated statistics, and took 10 hours to complete (mostly
>> bottlenecked by network and I/O).
>>
>>
>>
>>
>>
>> On Thu, Mar 20, 2014 at 11:12 AM, Surendranauth Hiraman <
>> suren.hira...@velos.io> wrote:
>>
>>> Reynold,
>>>
>>> How complex was that job (I guess in terms of number of transforms and
>>> actions) and how long did that take to process?
>>>
>>> -Suren
>>>
>>>
>>>
>>> On Thu, Mar 20, 2014 at 2:08 PM, Reynold Xin <r...@databricks.com>
>>> wrote:
>>>
>>> > Actually, we just ran a job with 70TB+ compressed data on 28 worker nodes.
>>> > I didn't count the size of the uncompressed data, but I am guessing it is
>>> > somewhere between 200TB and 700TB.
>>> >
>>> >
>>> >
>>> > On Thu, Mar 20, 2014 at 12:23 AM, Usman Ghani <us...@platfora.com> wrote:
>>> >
>>> > > All,
>>> > > What is the largest input data set y'all have come across that has been
>>> > > successfully processed in production using Spark? Ballpark?
>>> > >
>>> >
>>>
>>>
>>>
>>> --
>>>
>>> SUREN HIRAMAN, VP TECHNOLOGY
>>> Velos
>>> Accelerating Machine Learning
>>>
>>> 440 NINTH AVENUE, 11TH FLOOR
>>> NEW YORK, NY 10001
>>> O: (917) 525-2466 ext. 105
>>> F: 646.349.4063
>>> E: suren.hiraman@velos.io
>>> W: www.velos.io
>>>
>>
>>
