Hi

Let me add some notes based on my experience and further thoughts on this.

On Monday, June 2, 2014 at 17:51:06 UTC+2, Evan Chan wrote:
>
> Here are some characteristics of such a platform:
>
>    - Service discovery
>
We've been using ZooKeeper for discovering individual Akka-based services 
on the network. ZK does this job very well. We tried Akka cluster when it 
was still experimental. Later we decided that a strong consistency model 
would be a better fit for the solution we needed to create. But it largely 
depends on your requirements. If you need to spin up thousands of JVMs, then 
Akka cluster with its gossip protocol will scale very well. However, what 
we wanted was to create a low-latency platform using a cell 
architecture [0], so you don't have to deal with that many services. In our 
case a single ZK instance can handle a hundred services just fine. 
One aspect that's just as important as discovery is error 
handling. Having a single consistent view of your cluster certainly makes 
it easier to decide when a node needs to be removed. In my experience, 
Akka cluster with its gossip protocol can sometimes make it hard to find 
out exactly why nodes have been removed from the cluster (which is 
important to know for our customers). But that's just anecdotal evidence 
on my side. Maybe gossip-based service discovery can also work fine, but 
I don't really see the benefits over ZooKeeper as long as you don't 
need to run a very large cluster.
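The failure-detection property ZK gives you comes from its ephemeral nodes: a registration lives only as long as the owning session. A minimal sketch of that pattern in plain Scala (no real ZK client; `Registry`, `Session`, and the paths are illustrative, not an actual API):

```scala
// Illustrative in-memory model of ZK-style ephemeral registration:
// each service instance registers under a path tied to its session,
// and when the session expires (e.g. the JVM died) all its entries
// vanish, which is what lets ZK-based discovery spot failed nodes.
case class Session(id: Long)

class Registry {
  private var entries = Map.empty[String, (Session, String)] // path -> (owner, address)

  def register(s: Session, path: String, address: String): Unit =
    entries += (path -> (s, address))

  def discover(prefix: String): List[String] =
    entries.collect { case (p, (_, addr)) if p.startsWith(prefix) => addr }.toList.sorted

  // Session expiry removes every ephemeral entry the session owned,
  // so clients immediately stop discovering the dead node.
  def expire(s: Session): Unit =
    entries = entries.filter { case (_, (owner, _)) => owner != s }
}
```

With a real ZK you'd get the same effect via `CreateMode.EPHEMERAL` plus watches on the service's parent path.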

 

>
>    - Supporting different kinds of data flow topologies - request response, 
>    as well as streaming data; pub-sub, etc.
>
Very interesting point. But how far would you go? There are distributed 
platforms such as Storm [1] (not exactly a microservices platform, I know) 
that provide features such as fault tolerance based on their idea of how a 
flow topology should work. On the other hand, you lose flexibility 
whenever you need to define a static topology for your services. 
Pub-sub is really nice to work with as a developer. But it's not suitable 
for every use case and can make data flows hard to follow. It 
would be great to have a platform that supports multiple models based 
on what you need. In this regard I've always liked ZeroMQ [2] for offering 
a great deal of options here, without getting in your way. 
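To make the "multiple models behind one platform" idea concrete, here is a toy sketch of the two interaction styles side by side (names like `ReqRep` and `PubSub` are made up for illustration; ZeroMQ exposes these as socket types):

```scala
// Request/response: exactly one handler, and the caller gets a reply.
class ReqRep(handler: String => String) {
  def request(msg: String): String = handler(msg)
}

// Pub-sub: any number of subscribers, fire-and-forget, no replies.
// Flexible, but the data flow is harder to follow than req/rep.
class PubSub {
  private var subscribers = List.empty[String => Unit]
  def subscribe(f: String => Unit): Unit = subscribers ::= f
  def publish(msg: String): Unit = subscribers.foreach(_(msg))
}
```

A platform that offers both lets each service interaction pick the model that fits, instead of forcing everything through one of them.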

 

>
>    - Provide common abstractions for efficient data serialization
>
Akka is pretty good at handling data serialization. Maybe each service 
description should also specify which serialization protocol is used 
by the Akka endpoint. But I'd rather see this handled by Akka remoting 
instead of a microservices framework. E.g. it would be really nice to be 
able to support multiple serialization protocols for your remote actors. 
Akka would need to figure out which one was used by the sender and 
select the appropriate protocol on the receiver end (if supported).
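The negotiation could be as simple as tagging every wire message with the id of the serializer that produced it, so the receiver can pick a matching deserializer if it supports that protocol. A sketch (the `Proto` trait, ids and registry are invented for illustration; Akka already has a serializer-id mechanism this could hang off):

```scala
// Each protocol knows its id and how to encode/decode a payload.
trait Proto {
  def id: Int
  def encode(s: String): Array[Byte]
  def decode(b: Array[Byte]): String
}

object Utf8Proto extends Proto {
  val id = 1
  def encode(s: String): Array[Byte] = s.getBytes("UTF-8")
  def decode(b: Array[Byte]): String = new String(b, "UTF-8")
}

// The sender stamps the envelope with the protocol id it used.
case class Envelope(protoId: Int, payload: Array[Byte])

// The receiver looks the id up in its supported set; None means the
// sender used a protocol this endpoint doesn't speak.
class Endpoint(supported: Map[Int, Proto]) {
  def receive(env: Envelope): Option[String] =
    supported.get(env.protoId).map(_.decode(env.payload))
}
```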
 

>
>    - Support backpressure and flow control, to rate limit requests
>
Lack of back pressure is by far the hardest problem I've come across when 
dealing with distributed Akka applications, especially for low-latency, 
high-throughput systems. So I'm very happy to see Reactive Streams 
happening! Obviously I'm not the only person feeling the pain here. 

However, what does back pressure mean in the context of microservices? If you 
need to call a service and none is available, how will you handle this at the 
framework level? If there's data waiting to be processed but no service 
running to accept it, what does this mean for your data 
producers? When people talk about back pressure, most of the time the 
discussion is reduced to mailbox congestion and producer/consumer 
interaction. But if you have a set of services all talking to each other 
without a well-defined topology model, enabling back pressure in terms of 
transmission throttling for end-to-end communication isn't enough. 
For example, say I have a billing service that pulls pending orders 
from a queue. Each order is sent to a payment service and afterwards 
to a mail confirmation service, which in turn also talks to a 
recommendation service to retrieve a list of items to suggest for further 
purchase in the order confirmation mail. Now in this case, what 
happens if the recommendation service is down? From a business perspective, 
it's preferable to just keep sending confirmation mails without any 
recommendations and keep the billing process going. The developer should 
always be able to decide what to do in case any end-to-end service 
interaction fails. Automatic back pressure could potentially be more 
dangerous than useful in those situations. In the microservices platform 
I'd love to use as a developer, I'd always be able to change the way I 
interact with services based on the current cluster state and metrics. 
Service flows should be able to degrade gracefully in case non-critical 
interactions aren't possible or certain services are just slow. 
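The billing example above can be sketched with plain futures: the recommendation call is non-critical, so its failure degrades to an empty list instead of stalling the confirmation flow. The service names come from the example; the futures just stand in for remote calls, and the `serviceUp` flag fakes an outage:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Fake remote call to the recommendation service; serviceUp simulates
// whether the service is currently reachable.
def recommendations(orderId: String, serviceUp: Boolean): Future[List[String]] =
  if (serviceUp) Future.successful(List("item-42"))
  else Future.failed(new RuntimeException("recommendation service down"))

// The mail confirmation flow: a failed recommendation lookup degrades
// to "no suggestions" rather than failing the whole confirmation.
def confirmationMail(orderId: String, serviceUp: Boolean): Future[String] =
  recommendations(orderId, serviceUp)
    .recover { case _ => Nil } // degrade: mail still goes out
    .map(recs => s"order $orderId confirmed, suggestions: ${recs.mkString(",")}")
```

The point is that the developer chose the fallback (`recover`) per interaction; an automatic framework-level back pressure mechanism couldn't know that this particular call is safe to drop.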

 

>
>    - Support easy scaling of each component, including routing of messages 
>    or requests to multiple instances
>
Creating solid routing algorithms in a push topology will be tough. One 
possible option would be to take all remote mailbox sizes into account and 
calculate the average digestion rate. While routing a message to a remote 
service you'd need to consider which instance is able to process it within 
an acceptable time frame. Otherwise some kind of back pressure would need 
to be applied. 
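A rough sketch of that routing rule: estimate each instance's backlog as mailbox size divided by its observed digestion rate, route to the least-backlogged instance, and fall back to back pressure when nobody can keep up. All fields and thresholds here are illustrative; a real router would feed them from remote metrics:

```scala
// One remote service instance with its observed load figures.
case class Instance(name: String, mailboxSize: Int, msgsPerSecond: Double) {
  // Estimated time until a newly enqueued message gets processed.
  def etaSeconds: Double = mailboxSize / msgsPerSecond
}

// Route to the instance with the smallest backlog that still meets the
// deadline; None means no instance can keep up and the caller should
// apply back pressure upstream instead of pushing more messages.
def route(instances: List[Instance], maxEtaSeconds: Double): Option[Instance] =
  instances
    .filter(_.etaSeconds <= maxEtaSeconds)
    .sortBy(_.etaSeconds)
    .headOption
```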

I'm not a big fan of auto-scaling. Deploying new instances should be left 
to the team. There are a lot of great cluster managers nowadays, such as 
Mesos, which can help you greatly with distributing JVM processes. But I'd 
say in most cases people will start to use microservices on a fixed number 
of systems. With Akka there are already a lot of options for making your 
application scale. Adding further scale-out options as part of a 
microservices framework is probably not needed. 
 
  

>
>    - A common platform for application metrics
>
Again, ZooKeeper can be very handy for sharing metrics that have been 
collected using frameworks like Codahale Metrics. Standard metrics, such as 
message rates and JVM stats, could be provided out of the box by the 
platform; custom metrics could be added by the developer. The major 
question to me is how metrics should be shared: with a strong 
consistency or an eventual consistency model. This depends a lot on how the 
values will be used. If you're going to implement load balancing algorithms 
on top of your metrics (such as message rates), you'd better make sure to 
have a recent, consistent picture of what's going on in your cluster. 
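One simple way to express "recent enough to act on" is to attach a timestamp to every shared reading and have the load balancer discard anything older than a freshness window. A sketch (field names and the window are illustrative):

```scala
// A metric reading as it might be published to a shared store like ZK.
case class MetricReading(node: String, msgRate: Double, timestampMs: Long)

// A load balancer should only trust readings inside the freshness
// window; stale readings may describe a node that is long gone or
// now overloaded, so they are dropped rather than acted on.
def usableReadings(readings: List[MetricReading], nowMs: Long, maxAgeMs: Long): List[MetricReading] =
  readings.filter(r => nowMs - r.timestampMs <= maxAgeMs)
```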

 

>
>    - Distributed message or request tracing, to help with visibility and 
>    debugging
>
There are already solutions for that [3]. It's probably also more in the 
scope of the Akka framework itself than of a microservices framework. 
 

>
>    - Support polyglot development - it should be possible to develop 
>    services in different languages
>
Might also be out of scope; it should be left to Akka how to integrate 
other languages.



[0] http://highscalability.com/blog/2012/5/9/cell-architectures.html
[1] http://storm.incubator.apache.org/
[2] http://zeromq.org/
[3] https://github.com/levkhomich/akka-tracing/
 
