If you're in the Bay Area, you might be interested in attending the next 
installment of the Spark User Meetup, which will be on April 5th at UC 
Berkeley. To attend, *please register* at 
http://www.meetup.com/spark-users/events/58579402/ so we can get an accurate 
headcount.

This meetup will contain two talks:

Running Spark and Hadoop on a Private Cluster with Mesos
(Benjamin Hindman, UC Berkeley and Twitter)

This talk will cover how to deploy Spark to a cluster using the Apache Mesos 
cluster manager, and dynamically share resources with Hadoop MapReduce by 
running Hadoop through Mesos as well. It will focus on the upcoming 0.9 release 
of Mesos, which provides a variety of usability and fault tolerance fixes. We 
will demo how to set up and configure a cluster with Mesos, Spark, Hadoop 
MapReduce and HDFS starting from plain Linux machines. In addition, we'll cover 
practical issues such as how to find log files and debug your jobs.
 
Arthur: The Spark Debugger
(Ankur Dave, UC Berkeley)

Debugging large parallel jobs is hard, because the sheer scale of the 
computation makes it hard to track what's happening, inevitable weirdnesses in 
the data trigger errors, and it's difficult to tell whether a program is 
performing efficiently. To tackle this problem, we are designing Arthur, a 
debugger for Spark programs that provides visibility into the computation and 
powerful analysis features. One key feature of Arthur is that it can leverage 
the deterministic nature of Spark programs to efficiently replay part of a 
parallel job. Using this capability, users can rerun any task in the job in a 
single-process debugger to step through it line by line, or rebuild any 
intermediate dataset in the job and query it interactively from the Spark 
shell. We are also using replay to build tracing capabilities, such as figuring 
out which input records caused an output record. This talk will give an 
overview of the research going on in Arthur and cover several features that are 
already included in Spark. We also solicit your suggestions for improving 
debugging!

Pizza will be provided starting at 7 PM, and the talks themselves start at 
7:30 PM.

See you on Thursday!

Matei
