Hi folks,

Just as a heads-up, I think we're close to having the major features in place 
for a Spark 0.8.1 release, and we'd like to merge the Scala 2.10 branch into 
master to facilitate work on Spark 0.9. I'm thinking of doing this in the next 
week. 
This won't mean that development on Scala 2.9 will stop -- that will keep going 
on in the spark-0.8 branch, where we've been cherry-picking nearly every change 
from master. But it will make it easier to do the next phase of development 
(configuration system, changes to run scripts, etc.) for Scala 2.10 and Spark 
0.9. Let me know if you have any concerns about this.

By the way, there's been a lot of stuff contributed to 0.8.1 in the month since 
we released 0.8! Here are some of the things either merged or close to merging:

- Standalone master fault tolerance
- Shuffle file consolidation (improves performance of big shuffles)
- Better P2P broadcast (improves speed and stability of big broadcasts)
- Optimized hashtable classes
- Sending of large task results through block manager (improves performance)
- Sort() function in PySpark
- New ALS variant for implicit feedback
- Task and job killing

Matei
