Hello,
I came up with the following proposal, which I believe needs a lot of review.
I would appreciate it if you could help me make the appropriate corrections
before the submission deadline.
Thanks @gyfora, @pariscarbone


*GSoC project: Query optimisation layer for Flink Streaming
<https://issues.apache.org/jira/browse/FLINK-1617>*

NAME: Wepngong Ngeh Benaiah

EMAIL: [email protected]

*SYNOPSIS*

I would very much like to participate in GSoC 2015 with Apache
<http://apache.org/>, working on Flink <http://flink.org/> streaming as my
way of contributing to open source.

Flink streaming currently supports only a limited set of optimisations for
streaming programs, such as *operator chaining* and several optimisations
for *windowing* *computations*.

Also, there is currently no optimizer as a separate module of its own.
Although *operator chaining* improves performance, a lot more has to be done
to further improve system performance.

My project will be to implement a *Query Optimisation layer for Flink
Streaming*. This layer is meant to perform statistical graph analysis and
streaming graph optimisation, which is expected to bring major improvements
in system performance.

*HOW WOULD THE COMMUNITY BENEFIT FROM THIS?*

Much of “big data” is received in real time, and is most valuable at its
time of arrival. For example, a social network may want to identify
trending conversation topics within minutes, an ad provider may want to
train a model of which users click a new ad, and a service operator may
want to mine log files to detect failures within seconds.

Big data analytics is gaining ground across all domains of industry today,
and Flink is well placed to serve it. By reducing overheads and system
bottlenecks, this project would improve the throughput that companies get
from their streaming jobs, and many more people would want to use and
support the project.

*ABOUT ME*

I am an IT enthusiast and 3rd year Software Engineering student at the
University of Buea <http://ubuea.cm/>, Cameroon, pursuing a Bachelor of
Engineering in Computer Engineering. I have been programming in Java for
over 2 years and have worked with MySQL, PostgreSQL and web application
development in PHP (Laravel and Yii frameworks). I also have 3 years of
experience with the C programming language, Linux system administration
and, more recently, stream processing. I am currently in the 2nd semester
of my 3rd year and will be on internship at Orange Cameroon
<http://www.orange.cm/en/>, a mobile telecommunications company, by
September 2015.

I have contributed to https://github.com/ch3ck/sams, a student attendance
management system on which work is still going on, and to
https://github.com/NetLogo/NetLogo.

Finally, this is my GitHub account: https://github.com/bwepngong and Google
Plus: https://plus.google.com/+WepngongBenaiahNgeh

I expect to finish my B.Eng at the University of Buea in Cameroon in
December 2016.

*MILESTONES*

*30th March – 27th April 2015*

*1. Understand how Flink streaming works: look into the streamgraph and the
streamingjobgraphbuilder, and start doing simpler things with Flink.*

*2. Design and analyse the entire system.*

*3. Ask questions on the mailing lists for clarification.*

*27th April – 26th June (mid-term)*

*Implement*

   1. OPERATOR REORDERING: change the order in which the operators appear
   in the stream graph to eliminate overheads.
   2. Perform unit testing for this algorithm.
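
To make the reordering idea more concrete, here is a minimal Java sketch of
one possible rule. It is not Flink code: the FilterOp class and its cost and
selectivity estimates are hypothetical, and it assumes the filters are
stateless and independent, so that swapping them does not change the result.
Under those assumptions, sorting the filters by cost / (1 - selectivity)
gives the cheapest expected order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ReorderSketch {

    // Hypothetical description of a filter operator; not a Flink class.
    static final class FilterOp {
        final String name;
        final double costPerRecord; // assumed processing cost per input record
        final double selectivity;   // assumed fraction of records that pass (0..1)

        FilterOp(String name, double costPerRecord, double selectivity) {
            this.name = name;
            this.costPerRecord = costPerRecord;
            this.selectivity = selectivity;
        }
    }

    // Expected cost per source record of running the filters in the given order.
    static double expectedCost(List<FilterOp> chain) {
        double cost = 0.0;
        double surviving = 1.0; // fraction of records reaching the current filter
        for (FilterOp op : chain) {
            cost += surviving * op.costPerRecord;
            surviving *= op.selectivity;
        }
        return cost;
    }

    // Returns a reordered copy: cheap, highly selective filters come first.
    // A filter that drops nothing (selectivity 1.0) gets an infinite rank
    // and therefore ends up last.
    static List<FilterOp> reorder(List<FilterOp> chain) {
        List<FilterOp> sorted = new ArrayList<>(chain);
        sorted.sort(Comparator.comparingDouble(
                (FilterOp op) -> op.costPerRecord / (1.0 - op.selectivity)));
        return sorted;
    }

    public static void main(String[] args) {
        List<FilterOp> chain = Arrays.asList(
                new FilterOp("expensive", 10.0, 0.9),
                new FilterOp("cheap", 1.0, 0.1));
        System.out.println("cost before: " + expectedCost(chain));
        System.out.println("cost after:  " + expectedCost(reorder(chain)));
    }
}
```

In a real optimizer the cost and selectivity values would have to come from
the statistical analysis of the running job, and operators with state or
side effects would have to be excluded from reordering.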

*27th June – 13th August*

*Implement*

   1. REDUNDANCY ELIMINATION: eliminate redundant computations by analysing
   the streaming graph using the *RETE algorithm* and removing duplicate
   operators that are not necessary. When several operators depend on the
   same operator, compute that operator only once and share its result
   among them.
   2. Perform unit testing.
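
The sharing step can be sketched in Java as follows. This is not the full
RETE algorithm, only the node-sharing idea behind it: two operators that
apply the same function to the same inputs are merged into one. The Node
class and the use of a string label to identify an operator's function are
assumptions made for illustration, not Flink API, and the sketch assumes the
operators are deterministic and that the nodes arrive in topological order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SharingSketch {

    // Hypothetical stream graph node; not a Flink class.
    static final class Node {
        final int id;
        final String operator;      // label standing in for the operator's function
        final List<Integer> inputs; // ids of upstream nodes

        Node(int id, String operator, List<Integer> inputs) {
            this.id = id;
            this.operator = operator;
            this.inputs = inputs;
        }
    }

    // Merges nodes that apply the same operator to the same (already merged)
    // inputs. Expects the nodes in topological order: inputs before consumers.
    static List<Node> eliminateDuplicates(List<Node> nodes) {
        Map<Integer, Integer> canonical = new HashMap<>(); // node id -> surviving id
        Map<String, Integer> seen = new HashMap<>();       // signature -> surviving id
        List<Node> result = new ArrayList<>();

        for (Node node : nodes) {
            // Point every input at the surviving copy of its producer.
            List<Integer> inputs = new ArrayList<>();
            for (int input : node.inputs) {
                inputs.add(canonical.getOrDefault(input, input));
            }
            String signature = node.operator + inputs;
            Integer existing = seen.get(signature);
            if (existing != null) {
                // Same computation already present: reuse it, drop this node.
                canonical.put(node.id, existing);
            } else {
                seen.put(signature, node.id);
                canonical.put(node.id, node.id);
                result.add(new Node(node.id, node.operator, inputs));
            }
        }
        return result;
    }
}
```

For example, if two branches of a job both apply the same filter to the same
source, only one filter node survives and both downstream operators read
from it.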

*13th August – 21st August*

   1. Integrate modules and do system testing.

*22nd August – 28th August (final evaluation)*

Finish testing and get the required code samples ready.

*29th August – 8th November*

   1. More testing.
   2. Code documentation.
   3. Debugging.
