Hello,

The third notebook is ready for review at [0]. This notebook uses Apache Flink for the data analysis, and the visualizations use the HTML display and a Helium application.
The documentation and blog post for the notebook are ready for review at [1]. The ZeppelinHub viewer link for the notebook is at [2]. Since many of the visualizations don't display well in the ZeppelinHub viewer, I have shared a video at [3] that scrolls once through the entire notebook on my Zeppelin instance.

Thanks for the midterm feedback. I learned that I have to 'scalable bring data in pipelines', which I understood to mean that I should bring the data in using a more streaming fashion (maybe directly into Zeppelin from the data source). Please correct me if I'm wrong.

In the meantime, I will start work on the fourth notebook, on the Common Crawl datasets.

[0]. https://github.com/anish18sun/Zeppelin-Notebooks/tree/master/2BP262655
[1]. http://zeppelinnotes.blogspot.in/2016/07/transportation.html
[2]. https://www.zeppelinhub.com/viewer/notebooks/aHR0cHM6Ly9yYXcuZ2l0aHVidXNlcmNvbnRlbnQuY29tL2FuaXNoMThzdW4vWmVwcGVsaW4tTm90ZWJvb2tzL21hc3Rlci8yQlAyNjI2NTUvbm90ZS5qc29u
[3]. https://drive.google.com/open?id=0ByXTtaL2yHBuZ0FNdkNuV2JTUjA

Thanks,
Anish.