Hey Evgenii,

Cc'ing my answer to hpx-users

> My name's Evgenii. I'm a second-year student at BSUIR.
> I want to take part in GSoC 2016; I haven't participated before.
> 
> I want to implement a Map/Reduce Framework.
> I have a good knowledge of C, C++, GitHub, and Gerrit. I have no
> experience in open source.
> 
> I think that this task may be divided into 3 parts:
> 1. RPC (Remote Procedure Call)
> 2. Map
> 3. Reduce
> 
> I think that a good start would be to develop an application that
> applies a filter to an image.
> For example, I will divide my image into 4 parts, send them to the nodes,
> apply the filter, get the parts back, and finally fold them into one
> image. As a result, this application will be a good base to start work
> on the framework.

Welcome to GSoC 2016!

We already responded to another student who is looking into this project. Here 
is the essence of this conversation:

> 1) Instead of Map/Reduce it would be much more rewarding to implement
> the Google Dataflow Model, as it would provide efficient handling of both
> batch processing and real-time stream processing.

Yes! Good thinking.

> 2) Along with the Dataflow model, I would also borrow some of the
> features from MillWheel [1] and FlumeJava [2] (features such as fault
> tolerance, running efficient data-parallel pipelines, etc.).

Perfect. Do you have something more concrete in mind? Any use cases? Design 
ideas?

> 3) Construct the execution model as a directed graph, which would allow
> better optimisation than Map/Reduce; this approach would be useful, as
> complex optimisation would require multiple Map/Reduce steps.

Nod, that's what dataflow/async/futures etc. can give you. The only thing to 
keep in mind is that the execution tree generated by those is implicit, i.e. 
not directly accessible. In our experience this is not a problem, however.
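
To illustrate what "implicit" means here, below is a minimal sketch in 
standard C++ (std::async and std::future, not HPX's own API). The function 
names are made up for the example. The dependency graph has three nodes, but 
it is never stored anywhere; it exists only in which futures each step waits on.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical helper: sums the slice [begin, end) of the input.
int partial_sum(const std::vector<int>& data, std::size_t begin, std::size_t end)
{
    assert(begin <= end && end <= data.size());
    auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
    auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
    return std::accumulate(first, last, 0);
}

// Three-node graph: two independent partial sums feed one combine step.
int sum_in_two_halves(const std::vector<int>& data)
{
    const std::size_t mid = data.size() / 2;

    // Two independent nodes; nothing orders them relative to each other.
    std::future<int> left = std::async(std::launch::async,
        [&data, mid] { return partial_sum(data, 0, mid); });
    std::future<int> right = std::async(std::launch::async,
        [&data, mid] { return partial_sum(data, mid, data.size()); });

    // The dependent node waits on exactly the two futures it needs.
    // No graph object is ever built; the edges are these two get() calls.
    return left.get() + right.get();
}
```

Calling sum_in_two_halves on a vector runs both halves concurrently and 
combines them; there is no handle through which the graph itself could be 
inspected or rewritten.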

> Finally, I would really appreciate it if you could look into the above
> steps and further help me with reviews and other possible
> ideas/approaches for the project :)

We definitely will be here to discuss things as you start putting out ideas, 
questions, suggestions, etc. I think you have already started looking at HPX 
itself; if not, it might be a good time to start doing so.


The above is still true. The idea we had was to find a way to replace 
Map/Reduce with a new, dataflow-based programming model that overcomes the 
disadvantages of Map/Reduce (for instance, it imposes at least two global 
barriers on the computation, which reduces the available parallelism). In 
general, I'd ask you to keep poking at the problem and to keep developing 
your design. It would help if you outlined things in a bit more detail, so 
that we can really grasp what you're up to.
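
As a rough illustration of the barrier point, here is a minimal sketch in 
standard C++ (std::async and std::future, not HPX's own API). The two stage 
functions and all names are hypothetical stand-ins. The first function waits 
for every chunk between stages; the second lets each chunk run through both 
stages on its own, so the only synchronization left is the final reduction.

```cpp
#include <cassert>
#include <future>
#include <vector>

// Hypothetical per-chunk stages, stand-ins for real processing steps.
int stage_one(int x) { return x * 2; }
int stage_two(int x) { return x + 1; }

// Barrier style: no chunk starts stage_two until all chunks finish stage_one.
int run_with_barriers(const std::vector<int>& chunks)
{
    std::vector<std::future<int>> first;
    for (int c : chunks)
        first.push_back(std::async(std::launch::async, stage_one, c));

    std::vector<int> mid;
    for (auto& f : first)
        mid.push_back(f.get());   // barrier: collect every stage_one result

    std::vector<std::future<int>> second;
    for (int m : mid)
        second.push_back(std::async(std::launch::async, stage_two, m));

    int total = 0;
    for (auto& f : second)
        total += f.get();         // reduce: waits for each stage_two result
    return total;
}

// Dataflow style: each chunk flows through both stages independently.
int run_as_dataflow(const std::vector<int>& chunks)
{
    std::vector<std::future<int>> pipelines;
    for (int c : chunks)
        pipelines.push_back(std::async(std::launch::async,
            [c] { return stage_two(stage_one(c)); }));
    assert(pipelines.size() == chunks.size());

    int total = 0;
    for (auto& f : pipelines)
        total += f.get();         // the only synchronization point
    return total;
}
```

Both functions compute the same sum; they differ only in where the waiting 
happens. A slow chunk in the first version stalls every other chunk at the 
stage boundary, while in the second it delays only the final sum.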

HTH
Regards Hartmut
---------------
http://boost-spirit.com
http://stellar.cct.lsu.edu



_______________________________________________
hpx-users mailing list
[email protected]
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users