Re: Service grid redesign

Dmitriy Setrakyan Wed, 04 Apr 2018 12:21:12 -0700

Here is the correct link:
https://cwiki.apache.org/confluence/display/IGNITE/IEP-17%3A+Oil+Change+in+Service+Grid


I have looked at the tickets there, and I believe that we should not
support peer-deployment for services. It is very hard and I do not think we
should even try.

I am proposing closing this ticket as Won't Fix -
https://issues.apache.org/jira/browse/IGNITE-975

D.

On Wed, Apr 4, 2018 at 5:39 AM, Denis Mekhanikov <[email protected]>
wrote:

> Vyacheslav,
>
> I've just posted my first draft of the IEP:
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-17%3A+Service+grid+improvements
> It's not finished yet, but you can get the idea from it.
> If you have any thoughts, please let me know and I'll add them
> to the IEP.
>
> Denis
>
> On Wed, Apr 4, 2018 at 13:09, Vyacheslav Daradur <[email protected]> wrote:
>
> > Denis, thanks for the link.
> >
> > I looked through the task and I think I understand the point of your
> > redesign now.
> >
> > Do you have a clear plan or IEP for the whole redesign?
> >
> > I'm interested in this component and I'd like to take part in the
> > development.
> >
> >
> >
> > On Mon, Apr 2, 2018 at 2:55 PM, Denis Mekhanikov <[email protected]>
> > wrote:
> > > Vyacheslav,
> > >
> > > The service deployment design based on a replicated utility cache has
> > > proven to be unstable and deadlock-prone.
> > > You can find a list of related JIRA issues in my previous message.
> > >
> > > The intention behind it is similar to the binary metadata redesign
> > > that happened in the following ticket: IGNITE-4157
> > > <https://issues.apache.org/jira/browse/IGNITE-4157>
> > > This change in the service deployment procedure will eliminate the need
> > > for another internal replicated cache
> > > and make service deployment more reliable on an unstable topology.
> > >
> > > Denis
> > >
> > > On Tue, Mar 27, 2018 at 23:21, Vyacheslav Daradur <[email protected]> wrote:
> > >
> > >> Hi, Denis Mekhanikov!
> > >>
> > >> As far as I know, Ignite services are based on IgniteCache and we have
> > >> all its features. We can use listeners or continuous queries for
> > >> deployment synchronization.
> > >>
> > >> Why do you want to use the discovery layer for that?
> > >>
> > >> One more thing: we can use a baseline approach for services, meaning that
> > >> *IgniteServices.deploy()* returns a ready-to-work service after
> > >> deployment on baseline nodes and deploys to other nodes on demand, for
> > >> example when the load on the deployed service becomes high.
> > >>
> > >> Regarding versioning, maybe it makes sense to extend the public API:
> > >> IgniteServices.service(name, *version*)?
> > >>
> > >> At the first deployment, we can compute the service's hashcode (just as
> > >> an example) and store it. On a new deployment request for a service with
> > >> an existing name, we compute the new service's hashcode and compare the
> > >> two. If the hashcodes differ, we deploy the new service as a service
> > >> with a different version.
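
The hashcode comparison described above could be sketched roughly as follows. This is only an illustration: ServiceVersionRegistry and register() are hypothetical names, not part of the Ignite API, and hashing the serialized service bytes with Arrays.hashCode is an assumption (a real implementation would need a collision-resistant digest).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry, not part of the Ignite API. */
final class ServiceVersionRegistry {
    /** Last seen content hash per service name. */
    private final Map<String, Integer> hashes = new HashMap<>();
    /** Current version number per service name. */
    private final Map<String, Integer> versions = new HashMap<>();

    /**
     * Registers a deployment request and returns the version assigned to it.
     * The version is bumped only when the hash differs from the stored one.
     */
    synchronized int register(String name, byte[] serviceBytes) {
        int hash = Arrays.hashCode(serviceBytes); // assumption: hash of serialized form
        Integer known = hashes.get(name);
        if (known == null) {
            // First deployment under this name.
            hashes.put(name, hash);
            versions.put(name, 1);
            return 1;
        }
        if (known != hash) {
            // Same name, different content: treat as a new version.
            hashes.put(name, hash);
            return versions.merge(name, 1, Integer::sum);
        }
        // Same name, same content: keep the existing version.
        return versions.get(name);
    }
}
```

Redeploying identical bytes under the same name keeps the version, while changed bytes increment it.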
> > >>
> > >>
> > >> On Fri, Mar 23, 2018 at 10:03 PM, Denis Magda <[email protected]> wrote:
> > >> > Denis,
> > >> >
> > >> > Thanks for the extensive analysis. There is vast room for
> > >> > optimizations on the service grid side.
> > >> >
> > >> > Yakov, Sam, Alex G.,
> > >> >
> > >> > How do you like the idea of using the discovery protocol to exchange
> > >> > service grid system messages? Any pitfalls?
> > >> >
> > >> >
> > >> > --
> > >> > Denis
> > >> >
> > >> >
> > >> > On Fri, Mar 23, 2018 at 8:01 AM, Denis Mekhanikov <[email protected]> wrote:
> > >> >
> > >> >> Igniters,
> > >> >>
> > >> >> I'd like to start a discussion on Ignite service grid redesign.
> > >> >> We have a number of problems in our current architecture that have to
> > >> >> be addressed.
> > >> >>
> > >> >> Here are the most severe ones:
> > >> >>
> > >> >> One of them is the lack of a guarantee that a service is successfully
> > >> >> deployed and ready for work by the time the *IgniteServices.deploy*()*
> > >> >> methods return.
> > >> >> Furthermore, if an exception is thrown from the *Service.init()* method,
> > >> >> the deploying side is not able to receive it, or even to learn that the
> > >> >> service is in an unusable state.
> > >> >> So you may end up in a situation where you deployed a service without
> > >> >> receiving any errors, then called one of the service's methods, and hung
> > >> >> indefinitely on this invocation.
> > >> >> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-3392
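
To make the desired behaviour concrete, here is a minimal sketch of a deploy call whose result fails when initialization fails. It does not use Ignite at all: InitializableService and deployAsync are hypothetical stand-ins, and running init() through CompletableFuture is just one possible way to propagate the error to the caller.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

final class DeployWithFeedback {
    /** Stand-in for a service with an init step; not the Ignite Service interface. */
    interface InitializableService {
        void init() throws Exception;
    }

    /** Runs init() asynchronously; the returned future fails if init() throws. */
    static CompletableFuture<Void> deployAsync(InitializableService svc) {
        return CompletableFuture.runAsync(() -> {
            try {
                svc.init();
            } catch (Exception e) {
                // Surface the init failure to whoever waits on the future.
                throw new CompletionException(e);
            }
        });
    }
}
```

A caller that waits on the future with join() sees a CompletionException wrapping the original init() error, instead of a silent success followed by a hanging invocation.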
> > >> >>
> > >> >> Another problem is locking during service deployment on an unstable
> > >> >> topology.
> > >> >> This issue is caused by missing updates in continuous query listeners on
> > >> >> the internal cache.
> > >> >> It is hard to reproduce, but it happens sometimes. We shouldn't allow
> > >> >> deployment methods to hang without reporting anything.
> > >> >> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-6259
> > >> >>
> > >> >> I think we should change the deployment procedure to make it more
> > >> >> reliable.
> > >> >> Moving from operating over the internal replicated service cache to
> > >> >> sending custom discovery events seems to be a good idea.
> > >> >> Service deployment may trigger a discovery event that will make the
> > >> >> chosen nodes deploy the service, and the same event will notify other
> > >> >> nodes about the deployed service instances.
> > >> >> It will eliminate the need for distributed transactions on the internal
> > >> >> replicated system cache and make the service deployment protocol more
> > >> >> transparent.
> > >> >>
> > >> >> There are a few points that should be taken into account, though.
> > >> >>
> > >> >> First of all, we can't wait for services to be deployed and initialised
> > >> >> in the discovery thread.
> > >> >> So we need to make the notification about the service deployment result
> > >> >> asynchronous, presumably over the communication protocol.
> > >> >> I can think of a procedure similar to the current exchange protocol,
> > >> >> where service deployment is initiated with an initial discovery message,
> > >> >> followed by asynchronous notifications from the hosting servers over
> > >> >> communication. Finally, one more discovery message will notify all
> > >> >> nodes about the service deployment result and the location of the
> > >> >> deployed service instances. The coordinator will be responsible for
> > >> >> collecting the deployment results in this scheme.
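
The coordinator's part of this scheme could look roughly like the sketch below. It is only an illustration under assumptions: DeploymentTracker, onNodeResult and finalMessage are hypothetical names, node IDs are plain strings, and the "final discovery message" is reduced to a summary string.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical coordinator-side state for one service deployment. */
final class DeploymentTracker {
    private final String service;
    /** Nodes named in the initial discovery message that have not reported yet. */
    private final Set<String> pending;
    /** Errors reported by hosting nodes, keyed by node ID. */
    private final Map<String, String> errors = new TreeMap<>();

    DeploymentTracker(String service, Collection<String> targetNodes) {
        this.service = service;
        this.pending = new HashSet<>(targetNodes);
    }

    /**
     * Records an asynchronous result from a hosting node (error == null means success).
     * Returns true once every target node has reported.
     */
    synchronized boolean onNodeResult(String nodeId, String error) {
        // Ignore duplicate or unexpected reports.
        if (pending.remove(nodeId) && error != null)
            errors.put(nodeId, error);
        return pending.isEmpty();
    }

    /** Builds the summary that the final discovery message would carry. */
    synchronized String finalMessage() {
        if (!pending.isEmpty())
            throw new IllegalStateException("still waiting for " + pending);
        return errors.isEmpty()
            ? service + ": deployed"
            : service + ": failed on " + errors.keySet();
    }
}
```

The discovery thread only creates the tracker; results arrive later over communication, so nothing blocks while services initialise.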
> > >> >>
> > >> >> Another problem is failover when some nodes fail during deployment
> > >> >> or further work.
> > >> >> The following cases should be handled:
> > >> >>
> > >> >>    1. coordinator failure during deployment;
> > >> >>    2. failure of nodes that were chosen to host the service, during
> > >> >>    deployment;
> > >> >>    3. failure of nodes that contain deployed services, after the
> > >> >>    deployment.
> > >> >>
> > >> >> The first case may be resolved either by continuing the deployment with
> > >> >> a new coordinator or by cancelling it.
> > >> >> The second case will require another node to be chosen and notified.
> > >> >> Maybe another discovery message will be needed.
> > >> >> The third case will require redeployment, so the coordinator should track
> > >> >> topology changes and redeploy failed services.
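
The third case (redeployment after a hosting node fails) could be sketched as below. This is a simplified illustration, not Ignite code: ServiceAssignments and its methods are hypothetical, each service has a single instance, and "least loaded alive node" is an assumed placement policy. Coordinator failure and failures during deployment are not covered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical coordinator-side view of which node hosts which service. */
final class ServiceAssignments {
    private final Map<String, String> nodeByService = new TreeMap<>();
    private final SortedSet<String> aliveNodes = new TreeSet<>();

    void nodeJoined(String nodeId) {
        aliveNodes.add(nodeId);
    }

    /** Assigns the service to the alive node hosting the fewest services (ties go to the lowest ID). */
    String deploy(String service) {
        if (aliveNodes.isEmpty())
            throw new IllegalStateException("no alive nodes to host " + service);
        String best = null;
        long bestLoad = Long.MAX_VALUE;
        for (String node : aliveNodes) {
            long load = nodeByService.values().stream().filter(node::equals).count();
            if (load < bestLoad) {
                best = node;
                bestLoad = load;
            }
        }
        nodeByService.put(service, best);
        return best;
    }

    /** Handles a topology change: redeploys services of the failed node and returns their new hosts. */
    Map<String, String> nodeFailed(String nodeId) {
        aliveNodes.remove(nodeId);
        List<String> orphaned = new ArrayList<>();
        for (Map.Entry<String, String> e : nodeByService.entrySet())
            if (e.getValue().equals(nodeId))
                orphaned.add(e.getKey());
        Map<String, String> moved = new TreeMap<>();
        for (String service : orphaned) {
            nodeByService.remove(service);
            moved.put(service, deploy(service));
        }
        return moved;
    }
}
```

In a real cluster the new host would also have to be notified and report back, which is where the extra discovery message mentioned above would come in.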
> > >> >>
> > >> >> Another good improvement would be service versioning. This matter was
> > >> >> already discussed in another thread:
> > >> >> http://apache-ignite-developers.2346864.n4.nabble.com/Service-versioning-td20858.html
> > >> >> Let's resume this discussion and state the final decision here.
> > >> >> This feature is closely connected to peer class loading, which is
> > >> >> currently not working for services.
> > >> >> So service versioning should be implemented along with peer class
> > >> >> loading.
> > >> >> JIRA ticket for versioning:
> > >> >> https://issues.apache.org/jira/browse/IGNITE-6069
> > >> >> Peer class loading: https://issues.apache.org/jira/browse/IGNITE-975
> > >> >>
> > >> >> Please share your thoughts. Constructive criticism is highly
> > >> >> appreciated.
> > >> >>
> > >> >> Denis
> > >> >>
> > >>
> > >>
> > >>
> > >> --
> > >> Best Regards, Vyacheslav D.
> > >>
> >
> >
> >
> > --
> > Best Regards, Vyacheslav D.
> >
>
