Denis M. & Val, please share your vision about this topic.

On Fri, Aug 24, 2018 at 15:52, Vyacheslav Daradur <daradu...@gmail.com> wrote:
> Nick, Anton, thank you for stepping in.
>
> AFAIK, an Ignite cluster can move its state to a new version of Ignite
> using persistence only.
>
> Since Ignite v.2.3 persistence is configured per memory region, and the
> system memory region is not persistent, which means the system (utility)
> cache will not be recovered on cluster restart.
>
> Here is a ticket which describes the same issue:
> https://issues.apache.org/jira/browse/IGNITE-6629
>
> > BTW, does the proposed solution provide the guarantee that services
> > will be redeployed after each cluster restart, since now we're not
> > using the cache?
>
> No, only services described in IgniteConfiguration will be deployed at
> node startup, just as it is now.
>
> Am I wrong about something?
>
> On Thu, Aug 23, 2018 at 5:59 PM Anton Vinogradov <a...@apache.org> wrote:
> >
> > Vyacheslav,
> >
> > It looks like we are able to restart all services on grid startup from
> > the old definitions (inside the cache) in case persistence is turned on.
> > I see no problem in providing such an automated migration.
> > Also, we can test it using the compatibility framework.
> >
> > BTW, does the proposed solution provide the guarantee that services
> > will be redeployed after each cluster restart, since now we're not
> > using the cache?
> >
> > On Thu, Aug 23, 2018 at 15:21, Nikolay Izhikov <nizhi...@apache.org> wrote:
> > >
> > > Hello, Vyacheslav.
> > >
> > > Thanks for sharing your design.
> > >
> > > > I have a question about services migration from AI 2.6 to a new
> > > > solution
> > >
> > > Can you describe the consequences of not having a migration solution?
> > > What will happen on the user side?
> > >
> > > On Thu, 23/08/2018 at 14:44 +0300, Vyacheslav Daradur wrote:
> > > > Hi, Igniters!
> > > >
> > > > I'm working on the Service Grid redesign tasks, and the design
> > > > seems to be finished.
> > > >
> > > > The main goal of the Service Grid redesign is to provide the
> > > > missing guarantees:
> > > > - Synchronous service deployment/undeployment;
> > > > - Failover on coordinator change;
> > > > - Propagation of deployment errors across the cluster;
> > > > - Introduction of a deployment failures policy;
> > > > - Prevention of deployment initiators hanging during deployment;
> > > > - etc.
> > > >
> > > > I'd like to ask the community for their thoughts on the proposed
> > > > design, to be sure that all important things have been considered.
> > > >
> > > > Also, I have a question about services migration from AI 2.6 to
> > > > the new solution. It's very hard to provide migration tools for
> > > > users because of the significant changes: we don't use the utility
> > > > cache anymore. Should we spend time on this?
> > > >
> > > > I've prepared a definition of the new Service Grid design; it's
> > > > described below:
> > > >
> > > > *OVERVIEW*
> > > >
> > > > All nodes (servers and clients) are able to host services, but
> > > > client nodes are excluded from service deployment by default. The
> > > > only way to deploy a service on client nodes is to specify a node
> > > > filter in ServiceConfiguration.
> > > >
> > > > All deployed services are identified internally by a "serviceId"
> > > > (IgniteUuid). This gives us a base for such features as hot
> > > > redeployment and service versioning. It's important to be able to
> > > > identify and manage services with the same name but different
> > > > versions.
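A minimal sketch of the node-filter approach from the *OVERVIEW* above,
assuming the existing public ServiceConfiguration API is kept; the service
name and the no-op implementation are hypothetical placeholders:

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.services.Service;
import org.apache.ignite.services.ServiceConfiguration;
import org.apache.ignite.services.ServiceContext;

public class ClientNodeServiceSketch {
    /** Trivial no-op service, used only for illustration. */
    static class NoopService implements Service {
        @Override public void init(ServiceContext ctx) { /* no-op */ }
        @Override public void execute(ServiceContext ctx) { /* no-op */ }
        @Override public void cancel(ServiceContext ctx) { /* no-op */ }
    }

    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            ServiceConfiguration cfg = new ServiceConfiguration();

            cfg.setName("noop-service");                // Hypothetical service name.
            cfg.setService(new NoopService());
            cfg.setTotalCount(2);                       // Two instances cluster-wide.
            cfg.setNodeFilter(node -> node.isClient()); // Without a filter, clients are excluded.

            // Blocks until the requested instances are deployed on matching (client) nodes.
            ignite.services().deploy(cfg);
        }
    }
}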
> > > > All actions on a service's state change are processed according to
> > > > a unified flow:
> > > > 1) The initiator sends over disco-spi a request to change the
> > > > service state [deploy, undeploy], a
> > > > DynamicServicesChangeRequestBatchMessage, which is stored by all
> > > > server nodes in their own queue so that it can be processed on a
> > > > new coordinator if the coordinator fails;
> > > > 2) The coordinator calculates assignments, defines the actions in a
> > > > new message, ServicesAssignmentsRequestMessage, and sends it over
> > > > disco-spi to be processed by all nodes;
> > > > 3) Each node applies the actions and builds a single map message,
> > > > ServicesSingleMapMessage, that contains the service ids and the
> > > > number of instances deployed on this single node, and sends the
> > > > message over comm-spi to the coordinator (p2p);
> > > > 4) Once the coordinator receives all single map messages, it builds
> > > > a ServicesFullMapMessage that contains the service deployments
> > > > across the cluster and sends the message over disco-spi to be
> > > > processed by all nodes.
> > > >
> > > > *MESSAGES*
> > > >
> > > > class DynamicServicesChangeRequestBatchMessage {
> > > >     Collection<DynamicServiceChangeRequest> reqs;
> > > > }
> > > >
> > > > class DynamicServiceChangeRequest {
> > > >     IgniteUuid srvcId; // Unique service id (generated for deploy, existing id used for undeploy)
> > > >     ServiceConfiguration cfg; // Empty in case of undeploy
> > > >     byte flags; // Change type flags [deploy, undeploy, etc.]
> > > > }
> > > >
> > > > class ServicesAssignmentsRequestMessage {
> > > >     ServicesDeploymentExchangeId exchId;
> > > >     Map<IgniteUuid, Map<UUID, Integer>> srvcsToDeploy; // Deploy and reassign
> > > >     Collection<IgniteUuid> srvcsToUndeploy;
> > > > }
> > > >
> > > > class ServicesSingleMapMessage {
> > > >     ServicesDeploymentExchangeId exchId;
> > > >     Map<IgniteUuid, ServiceSingleDeploymentsResults> results;
> > > > }
> > > >
> > > > class ServiceSingleDeploymentsResults {
> > > >     int cnt; // Deployed instances count, 0 in case of undeploy
> > > >     Collection<byte[]> errors; // Serialized exceptions to avoid issues at spi level
> > > > }
> > > >
> > > > class ServicesFullMapMessage {
> > > >     ServicesDeploymentExchangeId exchId;
> > > >     Collection<ServiceFullDeploymentsResults> results;
> > > > }
> > > >
> > > > class ServiceFullDeploymentsResults {
> > > >     IgniteUuid srvcId;
> > > >     Map<UUID, ServiceSingleDeploymentsResults> results; // Per node
> > > > }
> > > >
> > > > class ServicesDeploymentExchangeId {
> > > >     UUID nodeId; // Initiator, joined or failed node id
> > > >     int evtType; // EVT_NODE_[JOIN/LEFT/FAILED], EVT_DISCOVERY_CUSTOM_EVT
> > > >     AffinityTopologyVersion topVer;
> > > >     IgniteUuid reqId; // Unique id of the custom discovery message
> > > > }
> > > >
> > > > *COORDINATOR CHANGE*
> > > >
> > > > All server nodes handle requests for service state changes and put
> > > > them into the deployment queue, but only the coordinator processes
> > > > them. If the coordinator leaves or fails, they will be processed on
> > > > the new coordinator.
> > > >
> > > > *TOPOLOGY CHANGE*
> > > >
> > > > Each topology change (NODE_JOIN/LEFT/FAILED event) triggers a
> > > > service state deployment task. Assignments will be recalculated and
> > > > applied for each deployed service.
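As a rough illustration of step 4 of the flow above: the coordinator
regroups the per-node single maps into the per-service full map. Plain
java.util maps stand in for the proposed ServicesSingleMapMessage /
ServicesFullMapMessage classes here, and the reduce logic is only a sketch
of the idea, not the actual implementation:

import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

import org.apache.ignite.lang.IgniteUuid;

public class FullMapReduceSketch {
    /**
     * Regroups (nodeId -> (serviceId -> deployed count)), i.e. the single
     * map results collected over comm-spi, into (serviceId -> (nodeId ->
     * deployed count)), i.e. the cluster-wide view sent back over disco-spi.
     */
    static Map<IgniteUuid, Map<UUID, Integer>> buildFullMap(
        Map<UUID, Map<IgniteUuid, Integer>> singleMaps) {
        Map<IgniteUuid, Map<UUID, Integer>> fullMap = new HashMap<>();

        for (Map.Entry<UUID, Map<IgniteUuid, Integer>> byNode : singleMaps.entrySet()) {
            UUID nodeId = byNode.getKey();

            for (Map.Entry<IgniteUuid, Integer> bySrvc : byNode.getValue().entrySet()) {
                fullMap.computeIfAbsent(bySrvc.getKey(), id -> new HashMap<>())
                    .put(nodeId, bySrvc.getValue()); // Instances deployed on that node.
            }
        }

        return fullMap;
    }
}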
> > > > *CLUSTER ACTIVATION/DEACTIVATION*
> > > >
> > > > - On deactivation:
> > > >     * local services are undeployed;
> > > >     * requests are not handled (including deployment/undeployment);
> > > > - On activation:
> > > >     * local services are redeployed;
> > > >     * requests are handled as usual.
> > > >
> > > > *RELATED LINKS*
> > > >
> > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-17%3A+Oil+Change+in+Service+Grid
> > > >
> > > > http://apache-ignite-developers.2346864.n4.nabble.com/Service-grid-redesign-td28521.html
>
> --
> Best Regards, Vyacheslav D.
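A minimal sketch of the activation/deactivation behaviour described in the
proposal, using the existing IgniteCluster API; the undeploy/redeploy
expectations in the comments come from the proposal itself and are not
demonstrated by the snippet:

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;

public class ActivationSketch {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            // Deactivation: under the proposal, local services are undeployed and
            // deployment/undeployment requests are not handled while inactive.
            ignite.cluster().active(false);

            // Activation: local services are redeployed and requests are handled as usual.
            ignite.cluster().active(true);
        }
    }
}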