http://www.meetup.com/Bay-Area-Stream-Processing/events/219086133/
Thursday, June 4, 2015, 6:45 PM

TubeMogul <http://maps.google.com/maps?f=q&hl=en&q=1250+53rd%2C+Emeryville%2C+CA%2C+94608%2C+us>
1250 53rd St #1, Emeryville, CA

6:45PM to 7:00PM - Socializing
7:00PM to 8:00PM - Talks
8:00PM to 8:30PM - Socializing

Speaker: *Bill Zhao (from TubeMogul)*

Bill worked as a researcher in the UC Berkeley AMP lab during the creation of Spark and Tachyon, where he worked on improving Spark memory utilization and Spark-Tachyon integration.

The AMP lab: Working at the intersection of three massive trends (powerful machine learning, cloud computing, and crowdsourcing), the AMPLab is integrating Algorithms, Machines, and People to make sense of Big Data.

Topic: *Introduction to Spark and Tachyon*

Description:

Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or in Spark's standalone mode, and it can process data in HDFS, etc. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.

Tachyon is a memory-centric distributed storage system that enables reliable data sharing at memory speed across cluster frameworks such as Spark and MapReduce. It achieves high performance by leveraging lineage information and using memory aggressively. Tachyon caches working-set files in memory, so frequently read datasets do not have to be loaded from disk each time. This lets different jobs, queries, and frameworks access cached files at memory speed.
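
For readers who want a feel for the caching idea before the talk, here is a toy sketch in plain Python. It is not Tachyon's API and says nothing about how Tachyon is actually implemented; the class `WorkingSetCache`, the `FAKE_DISK` dictionary, and the least-recently-used eviction policy are all assumptions made for illustration. It only shows the general effect described above: once a file is in memory, repeated reads from any job skip the disk.

```python
from collections import OrderedDict
from typing import Callable


class WorkingSetCache:
    """Toy read-through cache that keeps recently read files in memory.

    Hypothetical illustration only; eviction here is simple LRU.
    """

    def __init__(self, load_from_disk: Callable[[str], bytes], capacity: int = 2):
        self._load = load_from_disk   # stand-in for a slow disk read
        self._capacity = capacity     # max number of files held in memory
        self._files: "OrderedDict[str, bytes]" = OrderedDict()
        self.disk_reads = 0           # how many times we had to go to "disk"

    def read(self, path: str) -> bytes:
        if path in self._files:
            # Cache hit: serve from memory and mark as most recently used.
            self._files.move_to_end(path)
            return self._files[path]
        # Cache miss: load from "disk" and remember the contents.
        self.disk_reads += 1
        data = self._load(path)
        self._files[path] = data
        if len(self._files) > self._capacity:
            # Evict the least recently used file to stay within capacity.
            self._files.popitem(last=False)
        return data


# Stand-in for files on disk (made-up paths and contents).
FAKE_DISK = {"/data/a": b"alpha", "/data/b": b"beta", "/data/c": b"gamma"}

cache = WorkingSetCache(FAKE_DISK.__getitem__, capacity=2)

# Several "jobs" read the same file; only the first read touches disk.
for _ in range(3):
    cache.read("/data/a")
cache.read("/data/b")

print(cache.disk_reads)  # 2
```

A real system also has to handle fault tolerance and sharing across processes and machines, which is where the lineage information mentioned in the description comes in; this sketch leaves all of that out.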