Thanks for your comments, @rmetzger and @mbalassi. I will make the necessary corrections and post the proposal again for review.
On Tue, Mar 24, 2015 at 10:01 PM, Robert Metzger <[email protected]> wrote:

> Just a quick ping on this for the streaming folks: The deadline for the
> proposal submissions is Friday, so the GSoC applicants need to get our
> feedback asap.
> The student asked me today in the #flink channel whether we can review
> this proposal.
>
> I have the following comments regarding the proposal:
> - I don't exactly understand how you've chosen the dates for the
> milestones. According to
> https://www.google-melange.com/gsoc/events/google/gsoc2015 the coding
> phase begins on 25 May and ends on 21 August. It seems that you are
> suggesting to start with the implementation before the official GSoC
> start date.
> I would suggest aligning the milestones with the official GSoC timeline
> (or at least justifying in the proposal why you're deviating from it).
> - Can you explain a bit more how you are planning to do operator
> reordering and how the "rete algorithm" works? Also, some background on
> why you've chosen that algorithm would be helpful.
>
> On Sun, Mar 22, 2015 at 4:44 AM, Wepngong Benaiah <[email protected]>
> wrote:
>
>> Hello,
>> I came up with the following proposal, which I believe needs a lot of
>> review. I would appreciate it if you could help me make appropriate
>> corrections before the deadline for submission.
>> Thanks @gyfora, @pariscarbone
>>
>> *GSoC project: Query optimisation layer for Flink Streaming
>> <https://issues.apache.org/jira/browse/FLINK-1617>*
>>
>> NAME: Wepngong Ngeh Benaiah
>>
>> EMAIL: [email protected]
>>
>> *SYNOPSIS*
>>
>> I would very much like to participate in GSoC 2015 with Apache
>> <http://apache.org/>, working on Flink <http://flink.org/> streaming as
>> my way of contributing to open source.
>>
>> Flink streaming currently supports only a limited set of optimisations
>> for streaming programs, such as *operator chaining* and
>> several optimisations for *windowing computations*.
>>
>> Also, there is currently no optimizer as a separate module.
>> Though *operator chaining* improves performance, a lot more has to be
>> done to further improve system performance.
>>
>> My project will be to implement a *Query Optimisation layer for Flink
>> Streaming*. It is intended to perform statistical graph analysis and
>> streaming graph optimization, which is expected to bring major system
>> performance improvements.
>>
>> *HOW WOULD THE COMMUNITY BENEFIT FROM THIS?*
>>
>> Much of “big data” is received in real time, and is most valuable at its
>> time of arrival. For example, a social network may want to identify
>> trending conversation topics within minutes, an ad provider may want to
>> train a model of which users click a new ad, and a service operator may
>> want to mine log files to detect failures within seconds.
>>
>> Big data analytics is rapidly gaining ground in all domains of industry
>> today, and Flink is a solution for it. By reducing overheads and system
>> bottlenecks, the throughput of companies using Flink will improve, and
>> many more people will want to use and support the project.
>>
>> *ABOUT ME*
>>
>> I am an IT enthusiast and 3rd-year Software Engineering student at the
>> University of Buea <http://ubuea.cm/>, Cameroon, pursuing a Bachelor of
>> Engineering in Computer Engineering. I have been programming in Java for
>> 2+ years, and I have worked with MySQL, PostgreSQL, web application
>> development in PHP (Laravel and Yii frameworks), the C programming
>> language (3 years of experience), Linux system administration and,
>> recently, stream processing. I'm currently in the 2nd semester of my
>> 3rd year and will be on an internship at Orange Cameroon
>> <http://www.orange.cm/en/>, a mobile telecommunications company,
>> by September 2015.
>>
>> I have contributed to https://github.com/ch3ck/sams, where work on the
>> student attendance management system is still going on,
>> https://github.com/NetLogo/NetLogo and.
>>
>> Finally, this is my GitHub account: https://github.com/bwepngong and
>> Google Plus: https://plus.google.com/+WepngongBenaiahNgeh
>>
>> I am finishing my B.Eng at the University of Buea in Cameroon in December
>> 2016.
>>
>> *Milestones*
>>
>> *30th March – 27th April 2015*
>>
>> 1. Understand how Flink streaming works, look into the streamgraph and
>> the streamingjobgraphbuilder, and start doing simpler things with Flink.
>>
>> 2. Design and analysis of the entire system.
>>
>> 3. Ask questions on the mailing lists for clarification.
>>
>> *27th April – 26 June (mid-term)*
>>
>> Implement
>>
>> 1. OPERATOR REORDERING: changing the order in which the operators
>> appear in the stream graph to eliminate overheads.
>> 2. Perform unit testing for this algorithm.
>>
>> *27th June – 13 August*
>>
>> Implement
>>
>> 1. REDUNDANCY ELIMINATION: Eliminate redundant computations by analysing
>> the streaming graph using the *RETE algorithm* and removing duplicate
>> operators that are not necessary. When several operators depend on the
>> same operator, compute that operator only once and share its result
>> among them.
>>
>> 2. Perform unit testing.
>>
>> *13th August – 21 August*
>>
>> 1. Integrate modules and do system testing.
>>
>> *22nd August – 28th August (final evaluation)*
>>
>> Polish testing and get the required code samples ready.
>>
>> *29th August – 8th November*
>>
>> 1. More testing
>> 2. Code documentation
>> 3. Debugging
>>
>

--
Wepngong Ngeh Benaiah
"The similarities of sysadmins and drug dealers: both measure stuff in Ks, and both have users."
