Let's start aggregating student proposals on the wiki: <https://cwiki.apache.org/confluence/display/ZEPPELIN/GoogleSummerOfCode#GoogleSummerOfCode-StudentProposals>

Below are my thoughts. I'd love to volunteer as a mentor for this project, and feedback is very welcome. This is deliberately an open-ended project, so we need to work together to define a possible scope.

*Main Idea:* Use an open dataset <http://www.kdnuggets.com/datasets/index.html> (any dataset with a compatible licence) to build a set of Zeppelin notebooks with existing ML tools, showing how Zeppelin can help data scientists in their day-to-day tasks (cleaning the data, building the model, using it). An extra bonus would be to use modern deep learning techniques, e.g. to work with or classify images, or for any kind of NLP.

Good examples could be past Kaggle competitions, such as Titanic <https://www.kaggle.com/c/titanic-gettingStarted/details/new-getting-started-with-r> and others. There are many different ways to approach this, which leaves room for creative proposals.

The list of possible tools includes Python, R, Mahout, MLlib, PredictionIO, H2O, Sparkling Water, SINGA, etc.

Updates, directions, and suggestions that can help students write good proposals are more than welcome!

--
Kind regards,
Alexander.
