Here is the correct link: https://cwiki.apache.org/confluence/display/IGNITE/IEP-17%3A+Oil+Change+in+Service+Grid
I have looked at the tickets there, and I believe that we should not support peer-deployment for services. It is very hard, and I do not think we should even try. I propose closing this ticket as Won't Fix: https://issues.apache.org/jira/browse/IGNITE-975

D.

On Wed, Apr 4, 2018 at 5:39 AM, Denis Mekhanikov <dmekhani...@gmail.com> wrote:

> Vyacheslav,
>
> I've just posted my first draft of the IEP:
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-17%3A+Service+grid+improvements
> It's not finished yet, but you can get the idea from it.
> If you have any thoughts, please let me know and I'll add them to the IEP.
>
> Denis
>
> Wed, Apr 4, 2018 at 13:09, Vyacheslav Daradur <daradu...@gmail.com>:
>
> > Denis, thanks for the link.
> >
> > I looked through the task, and I think I understand your redesign point now.
> >
> > Do you have a clear plan or IEP for the whole redesign?
> >
> > I'm interested in this component and I'd like to take part in the development.
> >
> > On Mon, Apr 2, 2018 at 2:55 PM, Denis Mekhanikov <dmekhani...@gmail.com> wrote:
> > > Vyacheslav,
> > >
> > > The service deployment design, based on a replicated utility cache, has proven to be unstable and deadlock-prone.
> > > You can find a list of related JIRA issues in my previous letter.
> > >
> > > The intention behind it is similar to the binary metadata redesign that happened in the following ticket: IGNITE-4157
> > > <https://issues.apache.org/jira/browse/IGNITE-4157>
> > > This change in the service deployment procedure will eliminate the need for another internal replicated cache
> > > and make service deployment more reliable on an unstable topology.
> > >
> > > Denis
> > >
> > > Tue, Mar 27, 2018 at 23:21, Vyacheslav Daradur <daradu...@gmail.com>:
> > >
> > >> Hi, Denis Mekhanikov!
> > >>
> > >> As far as I know, Ignite services are based on IgniteCache and we have all its features.
> > >> We can use listeners or continuous queries for deployment synchronization.
> > >>
> > >> Why do you want to use the discovery layer for that?
> > >>
> > >> One more thing: we can use a baseline approach for services. That means *IgniteServices.deploy()* returns a ready-to-work service after deployment on the baseline nodes, and deploys to other nodes on demand, for example when the load on the deployed service is high.
> > >>
> > >> About versioning: maybe it makes sense to extend the public API with IgniteServices.service(name, *version*)?
> > >>
> > >> At the first deployment, we can compute the service's hashcode (just as an example) and store it. On a new deployment request for a service with an existing name, we compute the new service's hashcode and compare the two; if they differ, we deploy the new service as a service with a different version.
> > >>
> > >>
> > >> On Fri, Mar 23, 2018 at 10:03 PM, Denis Magda <dma...@apache.org> wrote:
> > >> > Denis,
> > >> >
> > >> > Thanks for the extensive analysis. There is vast room for optimization on the service grid side.
> > >> >
> > >> > Yakov, Sam, Alex G.,
> > >> >
> > >> > How do you like the idea of using the discovery protocol for exchanging service grid system messages? Any pitfalls?
> > >> >
> > >> >
> > >> > --
> > >> > Denis
> > >> >
> > >> >
> > >> > On Fri, Mar 23, 2018 at 8:01 AM, Denis Mekhanikov <dmekhani...@gmail.com> wrote:
> > >> >
> > >> >> Igniters,
> > >> >>
> > >> >> I'd like to start a discussion on the Ignite service grid redesign.
> > >> >> We have a number of problems in our current architecture that have to be addressed.
> > >> >>
> > >> >> Here are the most severe ones:
> > >> >>
> > >> >> One of them is the lack of a guarantee that a service is successfully deployed and ready for work by the time the *IgniteServices.deploy*()* methods return.
> > >> >> Furthermore, if an exception is thrown from the *Service.init()* method, the deploying side is not able to receive it, or even to understand that the service is in an unusable state.
> > >> >> So you may end up in a situation where you deploy a service without receiving any errors, then call one of the service's methods and hang indefinitely on this invocation.
> > >> >> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-3392
> > >> >>
> > >> >> Another problem is locking during service deployment on an unstable topology.
> > >> >> This issue is caused by missing updates in continuous query listeners on the internal cache.
> > >> >> It is hard to reproduce, but it happens sometimes. We shouldn't allow deployment methods to hang without reporting anything.
> > >> >> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-6259
> > >> >>
> > >> >> I think we should change the deployment procedure to make it more reliable.
> > >> >> Moving from operating on the internal replicated service cache to sending custom discovery events seems to be a good idea.
> > >> >> Service deployment may trigger a discovery event that makes the chosen nodes deploy the service, and the same event will notify other nodes about the deployed service instances.
> > >> >> It will eliminate the need for distributed transactions on the internal replicated system cache and make the service deployment protocol more transparent.
> > >> >>
> > >> >> There are a few points that should be taken into account, though.
> > >> >>
> > >> >> First of all, we can't wait for services to be deployed and initialised in the discovery thread.
> > >> >> So we need to make the notification about the service deployment result asynchronous, presumably over the communication protocol.
> > >> >> I can think of a procedure similar to the current exchange protocol, where service deployment is initiated with an initial discovery message, followed by asynchronous notifications from the hosting servers over communication. Finally, one more discovery message will notify all nodes about the service deployment result and the location of the deployed service instances. The coordinator will be responsible for collecting the deployment results in this scheme.
> > >> >>
> > >> >> Another problem is failover when some nodes fail during deployment or further work.
> > >> >> The following cases should be handled:
> > >> >>
> > >> >> 1. coordinator failure during deployment;
> > >> >> 2. failure of nodes that were chosen to host the service, during deployment;
> > >> >> 3. failure of nodes that contain deployed services, after the deployment.
> > >> >>
> > >> >> The first case may be resolved either by continuing the deployment with a new coordinator or by cancelling it.
> > >> >> The second case will require another node to be chosen and notified. Maybe another discovery message will be needed.
> > >> >> The third case will require redeployment, so the coordinator should track topology changes and redeploy failed services.
> > >> >>
> > >> >> Another good improvement would be service versioning. This matter was already discussed in another thread:
> > >> >> http://apache-ignite-developers.2346864.n4.nabble.com/Service-versioning-td20858.html
> > >> >> Let's resume this discussion and state the final decision here.
> > >> >> This feature is closely connected to peer class loading, which is not working for services currently.
> > >> >> So service versioning should be implemented along with peer class loading.
> > >> >> JIRA ticket for versioning: https://issues.apache.org/jira/browse/IGNITE-6069
> > >> >> Peer class loading: https://issues.apache.org/jira/browse/IGNITE-975
> > >> >>
> > >> >> Please share your thoughts. Constructive criticism is highly appreciated.
> > >> >>
> > >> >> Denis
> > >> >>
> > >>
> > >>
> > >>
> > >> --
> > >> Best Regards, Vyacheslav D.
> > >>
> >
> >
> >
> > --
> > Best Regards, Vyacheslav D.
> >
>
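
For reference, here is a minimal sketch of the coordinator-side bookkeeping in the deployment procedure proposed in the quoted message: the chosen nodes report their results asynchronously, and once all of them have reported, the coordinator produces the result that the final discovery message would carry. This is only an illustration under assumptions. All class and method names are hypothetical and not part of the Ignite API, node ids are plain strings, and the actual discovery and communication messaging is left out.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: none of these types exist in Ignite.
class ServiceDeploymentSketch {
    /** Payload of the final discovery message that closes a deployment. */
    static final class DeploymentResult {
        final String serviceName;
        final Set<String> hostingNodes;   // nodes where Service.init() succeeded
        final Map<String, String> errors; // node id -> failure description

        DeploymentResult(String serviceName, Set<String> hostingNodes, Map<String, String> errors) {
            this.serviceName = serviceName;
            this.hostingNodes = new LinkedHashSet<>(hostingNodes);
            this.errors = new LinkedHashMap<>(errors);
        }
    }

    /** Coordinator-side state for one deployment in progress. */
    static final class Coordinator {
        private final String serviceName;
        private final Set<String> pending;
        private final Set<String> deployed = new LinkedHashSet<>();
        private final Map<String, String> errors = new LinkedHashMap<>();
        private final Deque<String> spares;

        Coordinator(String serviceName, Collection<String> chosenNodes, Collection<String> spareNodes) {
            this.serviceName = serviceName;
            this.pending = new LinkedHashSet<>(chosenNodes);
            this.spares = new ArrayDeque<>(spareNodes);
        }

        /** Asynchronous report from a hosting node; error is null when init succeeded. */
        Optional<DeploymentResult> onNodeResult(String nodeId, String error) {
            if (!pending.remove(nodeId))
                return Optional.empty(); // unknown node or duplicate report
            if (error == null)
                deployed.add(nodeId);
            else
                errors.put(nodeId, error);
            return completeIfDone();
        }

        /** Failover case 2: a chosen node left before reporting. */
        Optional<DeploymentResult> onNodeFailed(String nodeId) {
            if (!pending.remove(nodeId))
                return Optional.empty();
            String replacement = spares.poll();
            if (replacement != null)
                pending.add(replacement); // the replacement would still have to be notified
            else
                errors.put(nodeId, "node left during deployment");
            return completeIfDone();
        }

        /** Returns the final result once no node is left to report. */
        private Optional<DeploymentResult> completeIfDone() {
            if (!pending.isEmpty())
                return Optional.empty();
            return Optional.of(new DeploymentResult(serviceName, deployed, errors));
        }
    }
}
```

An init failure reaches the deploying side through the errors map instead of being lost, which is the guarantee the quoted message asks for. Failover cases 1 and 3 (coordinator failure, and redeployment after a hosting node leaves) are not covered by this sketch.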