Greetings. I wanted to share our open source project, which we called
'akka-mapreduce' for lack of a better name. We are developing a small
Map/Reduce framework using Akka in Scala.
https://github.com/projetoeureka/akka-mapreduce
The idea is to run map-reduce jobs keeping the aggregated output in memory
all the way to the end. It is a more lightweight alternative to Spark,
Storm or Hadoop Streaming, aimed at ad-hoc data processing problems.
When your data is too big to load at once, you have to read it iteratively,
and that makes the problem unsuitable for things like Scala parallel
collections. Akka-mapreduce attempts to be a better fit for those situations
where, while you can't load the input into memory all at once, it is small
enough to be processed by a monolithic application on a single multi-core
machine.
We have developed a reducer component with a "decimator" actor that lets
us know when all the data has been processed. We are also creating a
builder class that lets you define a processing pipeline like this:
    val pipeline = pipe_mapkv {
      row: String => row split raw"\s+"
    } times 4 map {
      word => Some(KeyVal(word.trim.toLowerCase, 1))
    } times 4 reduce (_ + _) times 8 output self
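For anyone not used to this kind of DSL, here is roughly what that pipeline
computes, written as a sequential fold in plain Scala. This is only a sketch:
it does not use akka-mapreduce or Akka at all, the names WordCountSketch and
wordCount are made up for illustration, and the empty-token filter is an extra
step that is not in the pipeline above. The point is that the input is read
one line at a time and only the aggregated counts stay in memory.

```scala
// Sequential sketch of the word count expressed by the pipeline above.
// Not part of akka-mapreduce: WordCountSketch and wordCount are
// hypothetical names used only for illustration.
object WordCountSketch {

  // Consumes the lines one at a time; only the aggregated Map is kept in memory.
  def wordCount(lines: Iterator[String]): Map[String, Int] =
    lines
      .flatMap(_.split(raw"\s+"))   // first map stage: row => words
      .map(_.trim.toLowerCase)      // second map stage: normalize each word
      .filter(_.nonEmpty)           // assumption: drop empty tokens (not in the pipeline above)
      .foldLeft(Map.empty[String, Int]) { (acc, word) =>
        // reduce stage: combine the counts for one key with (_ + _)
        acc.updated(word, acc.getOrElse(word, 0) + 1)
      }

  def main(args: Array[String]): Unit = {
    val counts = wordCount(Iterator("The quick fox", "the lazy dog", "  The end"))
    counts.toSeq.sortBy(_._1).foreach { case (word, n) => println(s"$word\t$n") }
  }
}
```

The "times N" parts of the pipeline have no counterpart here, since the sketch
runs each stage on a single thread.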
We started the project for our own purposes, but we thought it might be of
interest to someone else, and decided to publish it. It is still a very
young project, so there are many rough edges and things are changing very
fast, but we would love to hear some feedback from the community.
There appear to be many Akka map-reduce examples out there on the
Internet, but few of them are in Scala, and even fewer offer a really good
solution for reading the data properly and for detecting that processing
has ended. We are not sure the framework will be adopted by anyone, but it
might have a couple of ideas that can be borrowed for your own projects.
Cheers,
++nic