Re: Writeup on Spark SQL with GDELT

2014-08-25 Thread Michael Armbrust
Thanks for this very thorough write-up and for continuing to update it as you progress! As I said in the other thread it would be great to do a little profiling to see if we can get to the heart of the slowness with nested case classes (very little optimization has been done in this code path).

Writeup on Spark SQL with GDELT

2014-08-21 Thread Evan Chan
I just put up a repo with a write-up on how to import the GDELT public dataset into Spark SQL and play around. Has a lot of notes on different import methods and observations about Spark SQL. Feel free to have a look and comment. http://www.github.com/velvia/spark-sql-gdelt