That message was pretty crazy, so I wanted to give some follow-up. I made 
some changes to my system, and now I am able to reach 100% CPU utilization 
most of the time. I am building this map-reduce framework while also 
solving the same problem with Spark, and I believe I got a 40% improvement 
in running time over it. So that was cool.

I would still like to better understand how Akka is behaving in my system, 
though. What I am doing is similar to a big "word count" problem, with 1M 
lines and 10k words per line, out of a possible 100k-word vocabulary. The 
change I made was that instead of having each word to be counted travel 
as a separate message to the final reducers, I am now moving iterators 
around. So I start reading a file chunk, then map functions lazily, and 
only actually parse and count the "words" up in the reducers, and then 
later I add them up. Reducing the number of messages flowing around was 
apparently a great help.
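
To make the change a bit more concrete, here is a rough sketch of the idea 
in plain Python rather than Akka. It is not the actual akka-mapreduce code: 
the names chunks, lazy_map, reduce_chunk and word_count are made up for 
illustration, and the "messages" are just function calls, so it only shows 
the shape of the data flow, not the actor plumbing.

```python
from collections import Counter
from itertools import islice


def chunks(lines, chunk_size):
    """Split the input into chunks; each chunk stands in for one message."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def lazy_map(chunk):
    """Mapper: build a lazy iterator of words; nothing is parsed yet."""
    return (word for line in chunk for word in line.split())


def reduce_chunk(word_iter):
    """Reducer: consume the iterator, so parsing and counting happen here."""
    return Counter(word_iter)


def word_count(lines, chunk_size=2):
    """Count words, passing one iterator per chunk instead of one word each."""
    total = Counter()
    messages = 0
    for chunk in chunks(lines, chunk_size):
        partial = reduce_chunk(lazy_map(chunk))  # one "message" per chunk
        messages += 1
        total += partial  # partial counts are added up afterwards
    return total, messages


# Tiny hypothetical input: 3 lines, 7 words in total.
lines = ["a b a", "c a", "b b"]
counts, messages = word_count(lines)
```

With this toy input there are 7 words but only 2 chunk-level messages, 
which is the kind of reduction in message traffic I am talking about.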

What is crazy is that, following the process in htop, I can still notice 
some cyclic behavior. The CPUs get to 100% with user load, but then move to 
~100% kernel load, and then there are brief periods of less than 100% 
occupation. It still looks like some kind of IO-related locking, but it 
should be possible to have this program running as a steady flow of data 
processing... So I'm still looking for advice on how to debug this system. 
Is Kamon the best tool for this kind of inspection?

Thanks, and sorry if this question is getting too weird or off-topic!
    ++nic



On Saturday, June 27, 2015 at 4:20:46 PM UTC-3, Nicolau Werneck wrote:
>
> I have created this small framework for running map-reduce data processing 
> jobs based on Akka. It's on github, 
> https://github.com/projetoeureka/akka-mapreduce. (I know, I could be 
> using Spark, but I still think it may pay off some day...)
>
> I have used it successfully with small tests and some medium sized jobs, 
> but now I'm finally trying to run a very large job. I have not yet tried to 
> use a cluster with separate machines, but I am using a 32-core (actually 8 
> cores x 4 threads AFAIK) AWS machine for this test.
>
> I am writing because I am a bit confused about the behavior I am seeing in 
> the machine when I execute the job. The program appears to be running fine, 
> I have some output indicating that. But if I open "htop" to look at the CPU 
> load, there is a weird behavior. The total CPU load oscillates, going to 
> almost 100% for a little time in bursts, and then moving down to a very low 
> value. There will be like 6 java threads running with 30% CPU load for a 
> while, and then 20 threads with 100% and then back to low. That happens 
> like every 20 seconds.
>
> It looks like some kind of file blocking, but I find it very difficult... 
> Also in my problem each line from the input generates lots of data to the 
> reducers, so the file IO really should not matter.
>
> Any ideas of what may be causing this, and how I could investigate? In my 
> 2x2 cores notebook I get 400% CPU utilization, so I don't even know exactly 
> how to reproduce the problem in other situations.
>
> The attachments show the weird behavior on htop.
>
> Thanks,
>     ++nic
>
