Greetings. I wanted to share with you our open source project we called 
'akka-mapreduce' for the lack of a better name. We are developing a small 
Map/Reduce framework using Akka in Scala.
    https://github.com/projetoeureka/akka-mapreduce

The idea is to run map-reduce jobs keeping the aggregated output in memory 
all the way to the end. It is a more lightweight alternative to Spark, 
Storm or Hadoop Streaming directed to ad-hoc data processing problems.

When your data is too big you have to read it iteratively, and that makes 
the problem unsuitable for things like Scala parallel collections. 
Akka-mapreduce attempts to be a better fit for those situations when, while 
you can't load the input to the memory all at once, it is small enough to 
be processed by a monolithic application in a single multi-core machine.

We have developed a reducer component with a "decimator" actor that allows 
us to know that all the data has been processed. We are also creating a 
builder class that lets you define a processing pipeline like this

val pipeline = pipe_mapkv {
  row: String => row split raw"\s+"
} times 4 map {
  word => Some(KeyVal(word.trim.toLowerCase, 1))
} times 4 reduce (_ + _) times 8 output self

We started the project for our own purposes, but we thought it might be of 
interest to someone else, and decided to publish it. It is still a very 
young project, so there are many rough edges and things are changing very 
fast, but we would love to hear some feedback from the community.

There appears to be a lot of Akka map-reduce examples out there on the 
Internet, but few of them are in Scala, and even fewer offer a really good 
solution to properly reading the data and to figure out that processing has 
ended. We are not sure the framework will be adopted by anyone, but it 
might have a couple of ideas that can be borrowed by your individual 
projects.

Cheers,
    ++nic

-- 
>>>>>>>>>>      Read the docs: http://akka.io/docs/
>>>>>>>>>>      Check the FAQ: 
>>>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>>      Search the archives: https://groups.google.com/group/akka-user
--- 
You received this message because you are subscribed to the Google Groups "Akka 
User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.

Reply via email to