Hi, let me add some notes based on my experience and some further thoughts on this.
On Monday, June 2, 2014 17:51:06 UTC+2, Evan Chan wrote:

> Here are some characteristics of such a platform:
>
> - Service discovery

We've been using ZooKeeper for discovering individual Akka-based services on the network, and ZK does this job very well. We tried Akka Cluster when it was still experimental, but later decided that a strong consistency model would be a better fit for the solution we needed to build. It largely depends on your requirements, though: if you need to spin up thousands of JVMs, Akka Cluster with its gossip protocol will scale very well. What we wanted, however, was a low-latency platform built on a cell architecture [0], so we don't have to deal with that many services. In our case a single ZK instance can handle a hundred services just fine.

One aspect that is just as important as discovery is error handling. Having a single consistent view of your cluster certainly makes it easier to decide when a node needs to be removed. In my experience, Akka Cluster's gossip protocol can sometimes make it hard to find out exactly why nodes were removed from the cluster (which is important for our customers to know) - but that's just anecdotal evidence on my part. Gossip-based service discovery may well work fine too; I just don't see its benefits over ZooKeeper as long as you don't need to run a very large cluster.

> - Supporting different kinds of data flow topologies - request response, as well as streaming data; pub-sub, etc.

Very interesting point, but how far would you go? There are distributed platforms such as Storm [1] (not exactly a microservices platform, I know) that provide features such as fault tolerance based on their own idea of how a flow topology should work. On the other hand, you lose flexibility whenever you have to define a static topology for your services. Pub-sub is really nice to work with as a developer.
But it's not suitable for every use case and can make data flows hard to follow. It would be great to have a platform that supports multiple models depending on what you need. In this regard I've always liked ZeroMQ [2] for offering a great deal of options here without getting in your way.

> - Provide common abstractions for efficient data serialization

Akka is pretty good at handling data serialization. Maybe each service description should also specify which serialization protocol is used by the Akka endpoint. But I'd rather see this handled by Akka remoting than by a microservices framework. For example, it would be really nice to be able to support multiple serialization protocols for your remote actors: Akka would figure out which one was used by the sender and select the appropriate protocol on the receiving end (if supported).

> - Support backpressure and flow control, to rate limit requests

Lack of back pressure is by far the hardest problem I've come across when dealing with distributed Akka applications, especially in low-latency, high-throughput systems. So I'm very happy to see Reactive Streams happening - obviously I'm not the only one feeling the pain here. However, what does back pressure mean in the context of microservices? If you need to call a service and none is available, how will you handle this at the framework level? If there's data waiting to be processed but no service running to accept it, what does that mean for your data producers? When people talk about back pressure, the discussion is usually reduced to mailbox congestion and producer/consumer interaction. But if you have a set of services all talking to each other without a well-defined topology model, enabling back pressure in terms of transmission throttling for end-to-end communication isn't enough. For example, say I have a billing service which pulls pending orders from a queue.
Each order is sent to a payment service and afterwards to a mail confirmation service, which in turn talks to a recommendation service to retrieve a list of items to suggest for further purchase in the order confirmation mail. Now, what happens if the recommendation service is down? From a business perspective it's preferable to just keep sending confirmation mails without any recommendations and keep the billing process going. The developer should always be able to decide what to do when an end-to-end service interaction fails; automatic back pressure could be more dangerous than useful in such situations. In the microservices platform I'd love to use as a developer, I'd always be able to change how I interact with services based on the current cluster state and metrics. Service flows should be able to degrade gracefully when non-critical interactions aren't possible or certain services are just slow.

> - Support easy scaling of each component, including routing of messages or requests to multiple instances

Creating solid routing algorithms in a push topology will be tough. One option would be to take all remote mailbox sizes into account and calculate the average digestion rate: while routing a message to a remote service you'd need to consider which instance is able to process it within an acceptable time frame, and otherwise apply some kind of back pressure. I'm not a big fan of auto-scaling; deploying new instances should be left to the team. There are a lot of great containers nowadays, such as Mesos, which can help you distribute JVM processes, but I'd say in most cases people will start using microservices on a fixed number of systems. With Akka there are already a lot of options for making your application scale, so adding further scale-out options to a microservices framework is probably not needed.
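To make the routing idea above concrete, here is a minimal sketch in plain Scala (standard library only, no Akka). It models "take remote mailbox sizes into account": estimate each instance's time-to-process from its queue length and average processing time, route to the least-loaded instance, and signal back pressure when even the best candidate is too slow. `ServiceInstance`, `LeastLoadedRouter`, and the ETA heuristic are hypothetical names for illustration, not Akka APIs.

```scala
// Hypothetical model of a remote service endpoint: its current mailbox size
// and an observed average processing time per message (both would come from
// shared metrics in a real system).
final case class ServiceInstance(id: String, mailboxSize: Int, avgProcessingMs: Double)

object LeastLoadedRouter {

  // Estimated time until a newly enqueued message would be fully processed
  // by this instance: everything already queued, plus the new message itself.
  private def etaMs(s: ServiceInstance): Double =
    (s.mailboxSize + 1) * s.avgProcessingMs

  // Pick the instance with the lowest ETA. Returns None - i.e. the caller
  // must apply back pressure or a fallback - when there are no instances,
  // or when even the best one exceeds the acceptable time frame.
  def routeTo(instances: Seq[ServiceInstance], maxEtaMs: Double): Option[ServiceInstance] =
    if (instances.isEmpty) None
    else Some(instances.minBy(etaMs)).filter(s => etaMs(s) <= maxEtaMs)
}
```

Note that a small mailbox alone isn't enough to prefer an instance: a short queue in front of a slow consumer can still mean a worse ETA than a longer queue in front of a fast one, which is why the sketch weighs queue length by processing rate.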
> - A common platform for application metrics

Again, ZooKeeper can be very handy for sharing metrics collected with frameworks like Codahale Metrics. Standard metrics, such as message rates and JVM stats, could be provided out of the box by the platform, and custom metrics could be added by the developer. The major question to me is whether metrics should be shared under a strong or an eventual consistency model. That depends a lot on how the values will be used: if you're going to implement load-balancing algorithms on top of your metrics (such as message rates), you'd better make sure you have a recent, consistent picture of what's going on in your cluster.

> - Distributed message or request tracing, to help with visibility and debugging

There are already solutions for that [3]. It's probably also more in the scope of the Akka framework itself than of a microservices framework.

> - Support polyglot development - it should be possible to develop services in different languages

This might also be out of scope and left to Akka, i.e. how it integrates other languages.

[0] http://highscalability.com/blog/2012/5/9/cell-architectures.html
[1] http://storm.incubator.apache.org/
[2] http://zeromq.org/
[3] https://github.com/levkhomich/akka-tracing/
