[akka-user] Re: Multi-Machine Distributed Work

Eric Pederson Fri, 10 Jan 2014 18:52:11 -0800

Hi Ryan - I'm curious to learn more about how you are using the 
cluster-aware router and what it provides over the basic clustering 
membership functionality.


Matt - we are also using a variation of "Balancing Workload Across Nodes" 
(pre Akka Cluster) and it works great.

Thanks,

On Friday, January 10, 2014 9:15:20 AM UTC-5, Ryan Tanner wrote:
>
> We use that pattern in conjunction with clustering.  We use a 
> cluster-aware router to deal with node membership within a worker role, but 
> we never actually send messages to the router's ActorRef. Instead, we let 
> the routees register themselves with the "role leader", as we call it, when 
> they start.
>
> On Friday, January 10, 2014 1:29:37 AM UTC-6, Hendrik Schöneberg wrote:
>>
>> As an alternative to the cluster approach you could check out 
>>
>>
>> http://letitcrash.com/post/29044669086/balancing-workload-across-nodes-with-akka-2
>>   
>>
>> This pattern allows you to dynamically add (local or remote) workers to a 
>> master actor.
>>
>> Hendrik
>>
>> On Thursday, January 9, 2014 11:52:31 PM UTC+1, Matt Edwards wrote:
>>>
>>> Hello,
>>>
>>> I am doing some initial research into distributed computing for a 
>>> project I work on, and I am wondering if Akka may be the solution. Below I 
>>> list the project essentials and a few design ideas. I would appreciate any 
>>> feedback to help narrow down the amount of literature I need to read up on. 
>>>
>>>
>>> The project is an analysis tool, and it only runs when the user clicks the 
>>> go button. Currently this process takes around 45 minutes (which, compared 
>>> to similar software, is lightning fast). However, one phase of this 
>>> analysis takes around 30 minutes to complete. This phase is a data 
>>> extrapolator (it takes a set of inputs and predicts future data over a 
>>> time range). This phase is also 100% independent of any other piece of 
>>> data (part A doesn't care about part B). For this reason Akka seems great. 
>>> On the not-so-great side, I need to block the mainline process until the 
>>> extrapolation is done, otherwise the analysis doesn't have the data it 
>>> needs.
>>>
>>> Also of note: this is all Java-based. I have no experience in Scala.
>>>
>>> The current implementation in Java builds a List<Runnable> for all the 
>>> data points, creates a CountDownLatch(list.size()), and then submits the 
>>> list to the ExecutorService. This spins up one thread on the CPU per 
>>> Runnable. When the Future is available the code pulls it, stores it in a 
>>> map, and decrements the latch. Once the latch is done, the code proceeds.
>>>
>>> I currently peg a 24-core (4x AMD Opteron(tm) Processor 8435 2.6GHz) 
>>> server for the full 30 minutes. I have also tested on an 80-core server 
>>> machine, with the same result: max thread usage, though quicker results. 
>>> This process is extremely predictable in performance as well. Double the 
>>> cores, half the time. On my normal 24-core machine, each thread takes 
>>> between 25 and 30 seconds to complete, so transmission times should be 
>>> small compared to processing times. With this in mind, it is far cheaper 
>>> to find 24-core machines than 80-core ones. It is also advantageous to use 
>>> some decommissioned equipment that has perfectly usable CPU power. (Also 
>>> note: memory is not a limiting factor in this.) These machines will all be 
>>> in the same server room; I don't plan on broadcasting cross-country for 
>>> this.
>>>
>>>
>>>
>>> The way I have interpreted Akka is that each "thread" is run through an 
>>> actor, rather than the executor service. But I am wondering how data gets 
>>> from a "master" and is routed out to another machine. How capable is it 
>>> for multi-machine distribution? And what is the better approach: 
>>> Clustering, Routing, or Remoting? The master actor can have a router 
>>> attached; can that router broadcast to remote actors? Should those remote 
>>> receivers be the final actor, or another router which then calls the 
>>> workers on the local machine? 
>>>
>>> I am still coming up to speed on how distributed computing and 
>>> clustering function, so forgive me if anything seems trivial or irrelevant.
>>>
>>> Essentially, what I would like to see, once everything is done, is 
>>> 3-5 servers (plus the main one) crunching the data extrapolation 
>>> simultaneously, thus cutting my time drastically. 
>>>
>>> I will be using the 2.3-M2 code base for now. My program is OSGi-based, 
>>> and this version of Akka was by far the more friendly, though it still 
>>> took some work to integrate.
>>>
>>> If I've read the documentation correctly, I can deploy the same 
>>> application to every machine and just tweak the config files. Will I need 
>>> an activator on the OSGi component for the remote machines? The "master" 
>>> program will only be on one machine; I just want to steal the resources of 
>>> another machine temporarily. Machines should also be able to come and go 
>>> from service, though I think the config files handle that as well... 
>>> correct? Or is all of that prior info misinterpreted, and I need a 
>>> special app on the remote machines?
>>>
>>> I know it's kind of long-winded; I just want to be thorough. Thanks for 
>>> reading, and thanks for the help.
>>>
>>> Matt
>>>
>>
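
For readers following the thread, the latch-based approach Matt describes in the quoted message (a List<Runnable>, a CountDownLatch sized to the list, an ExecutorService, and a results map) might look roughly like the sketch below. It is only an illustration under stated assumptions: the class name LatchExtrapolation, the methods extrapolate and extrapolateAll, and the toy linear extrapolation are hypothetical stand-ins, not code from the original project. It uses only the JDK, so it shows the single-machine baseline that an Akka-based design would replace, not any Akka API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LatchExtrapolation {

    // Hypothetical stand-in for the real extrapolator: continues the average
    // slope of the series by one step. Assumes at least two points.
    static double extrapolate(double[] series) {
        int n = series.length;
        double slope = (series[n - 1] - series[0]) / (n - 1);
        return series[n - 1] + slope;
    }

    // Runs one independent task per input series and blocks the caller
    // until every task has finished. Results are keyed by input index.
    static Map<Integer, Double> extrapolateAll(List<double[]> inputs, int threads)
            throws InterruptedException {
        Map<Integer, Double> results = new ConcurrentHashMap<>();
        CountDownLatch latch = new CountDownLatch(inputs.size());

        // Build the List<Runnable>, one entry per data point.
        List<Runnable> tasks = new ArrayList<>();
        for (int i = 0; i < inputs.size(); i++) {
            final int index = i;
            final double[] series = inputs.get(i);
            tasks.add(() -> {
                try {
                    results.put(index, extrapolate(series));
                } finally {
                    // Count down even if the task fails, so the caller
                    // is never left waiting forever.
                    latch.countDown();
                }
            });
        }

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (Runnable task : tasks) {
                pool.execute(task);
            }
            // The mainline blocks here until all tasks have counted down.
            latch.await();
        } finally {
            pool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        List<double[]> inputs = Arrays.asList(
                new double[] {1.0, 2.0, 3.0},
                new double[] {10.0, 20.0});
        int threads = Runtime.getRuntime().availableProcessors();
        Map<Integer, Double> results = extrapolateAll(inputs, threads);
        System.out.println(results);
    }
}
```

Because each task touches only its own input and writes to its own key, the same unit of work could in principle be handed to a worker on another machine; the blocking step (latch.await) is the part the mainline would still need in some form.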

