Hi Karthikeyan,

Sorry for the delayed response. Currently LQE does the following:

1. It processes the change log looking for the change event it processed last. If LQE fails to find the last processed change event and reaches the end of the change log, LQE reports a Truncated Change Log error.
2. If LQE finds a change event older than the last processed event before finding the last processed event itself, LQE reports a Change Log Rollback Detected error.

> How about having LQE process until the previously processed event OR any event with a lower order than the last processed event?

My limited understanding of the Focal Point TRS issue is that it spuriously ends up stamping a change event (existing or new?) with an invalid trs:order. As such, this is an invalid condition and must not occur. The sequence numbers of existing entries must be maintained, and the sequence numbers of newer entries must be greater than the previous ones.

It would help to know a bit more about the overall state of the Focal Point change log when this situation occurs. We can use that as an example to understand whether there is a way around it. Perhaps we can work through it offline.

Regards
Vivek

From: Karthikeyan Dakshinamurthy <[email protected]>
To: Arthur Ryman <[email protected]>
Cc: [email protected], [email protected], Vivek Garg/Cupertino/IBM@IBMUS
Date: 06/24/2013 01:16 AM
Subject: Re: [oslc-core] [Lifecycle-query-workgroup] TRS 2.0 Specification - Rollback Behavior

Hi Vivek,

Currently, LQE processes the events until it encounters the last processed order number; failing to find the exact last processed event, it throws a rollback error. How about having LQE process until the previously processed event OR any event with a lower order than the last processed event? Or is this going to be an issue?
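For illustration, the scanning behavior described in this exchange can be sketched roughly as follows. This is not LQE's code: the `ChangeEvent` type, the `scan_change_log` function, the status strings, and the `relaxed` flag are all hypothetical names invented here. The sketch assumes the change log is read newest first and that the sync point is matched by its trs:order value.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ChangeEvent:
    uri: str    # hypothetical event identifier
    order: int  # the event's trs:order value


def scan_change_log(change_log: List[ChangeEvent], last_order: int,
                    relaxed: bool = False) -> Tuple[str, List[ChangeEvent]]:
    """Scan a newest-first change log for the last processed order.

    Returns a status and the events newer than the stopping point.
    With relaxed=True the scan also stops at any lower order, which is
    the variant proposed in this thread (an assumption, not LQE behavior).
    """
    pending: List[ChangeEvent] = []
    for event in change_log:
        if event.order == last_order:
            return "ok", pending          # found the sync point
        if event.order < last_order:
            if relaxed:
                return "ok", pending      # accept a lower order as the stop
            return "rollback", []         # older event seen before sync point
        pending.append(event)             # newer than the sync point
    return "truncated", []                # reached the end without a match
```

In relaxed mode the scan cannot tell a spurious trs:order apart from a genuine rollback, so any such policy would presumably need to be an explicit, opt-in choice.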
Regards,
Karthikeyan Dakshinamurthy (Karthi)
Development Lead, Portfolio Strategy Management
Phone: 91-80-4177-6161 | Mobile: 91-9972032927
E-mail: [email protected] | Office: EGL D Block, Outer Ring Rd., Bangalore, India 560071

From: Arthur Ryman <[email protected]>
To: Vivek Garg <[email protected]>
Cc: [email protected], [email protected], Oslc-Core <[email protected]>, [email protected]
Date: 21-06-13 09:23 PM
Subject: Re: [oslc-core] [Lifecycle-query-workgroup] TRS 2.0 Specification - Rollback Behavior
Sent by: "Oslc-Core" <[email protected]>

Vivek,

It is not acceptable to simply re-index or remove a data source just because of some bad data that LQE interprets as a rollback. This can take days. It's simply not an option if we expect LQE to be used in production on very large data sets. The current behavior is too unstable. A human admin must be shown the alleged bad events and be allowed to make an informed decision, which must include the option to proceed by setting a new cutoff event. LQE should help by presenting the alleged rollback condition in context as clearly as possible. The human admin may be able to correct the problem at the data source. For example, he may be able to create some new change events so that any skipped resources get re-indexed.

Regards,
___________________________________________________________________________
Arthur Ryman
DE, Chief Architect, Reporting & Portfolio and Strategy Management
IBM Software, Rational
Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile)

From: Vivek Garg/Cupertino/IBM@IBMUS
To: Arthur Ryman <[email protected]>
Cc: Benjamin Williams <[email protected]>, [email protected], [email protected], [email protected], "Oslc-Core" <[email protected]>
Date: 06/21/2013 11:35 AM
Subject: Re: [oslc-core] [Lifecycle-query-workgroup] TRS 2.0 Specification - Rollback Behavior

Few comments and questions:

1. Current behavior: On a change log processing cycle, LQE scans the change log pages, looking for the change event it last processed (on a previous change log processing cycle). If LQE encounters a change event older than the event it last processed before finding the last processed event, LQE treats it as a rollback condition. In such cases, LQE essentially halts and waits for the admin's input via the UI (it is not an automatic re-index). The UI currently offers two recommended actions for the admin in such cases: Re-Index data source or Remove data source. Currently there is no Ignore action offered.
2. Current behavior: LQE currently does not retain the change event history locally, which it would need in order to perform an undo in case of a rollback. This appears to be a good enhancement for us to make in a future release of LQE.
3. Arthur, you mentioned Ignore as another possible action to be offered to the administrator. What is the Ignore behavior from a client perspective? Also, is the need for an Ignore action still valid if Focal Point's race condition issue were fixed (or any issues in the TRS spec that make it hard to implement the spec)?

Regards
Vivek

From: Arthur Ryman <[email protected]>
To: Benjamin Williams <[email protected]>
Cc: [email protected], [email protected], [email protected]
Date: 06/21/2013 10:11 AM
Subject: Re: [oslc-core] [Lifecycle-query-workgroup] TRS 2.0 Specification - Rollback Behavior
Sent by: "Oslc-Core" <[email protected]>

Ben,

Yes, if the server is rolled back, the index should react so that it mirrors the actual state of the server. The index might do that efficiently if it stored change events. In the worst case (and the normal case) it re-indexes from scratch, which can take days. My top priority would be to improve the admin UI so that an admin user can manually correct or override the problem, e.g. simply ignore it so LQE proceeds. In parallel, the admin can touch resources on the server to force them to get re-indexed later.
We need to avoid a full re-index.

Regards,
___________________________________________________________________________
Arthur Ryman
DE, Chief Architect, Reporting & Portfolio and Strategy Management
IBM Software, Rational
Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile)

From: Benjamin Williams <[email protected]>
To: Arthur Ryman/Toronto/IBM@IBMCA
Cc: [email protected], [email protected], [email protected]
Date: 06/13/2013 06:06 AM
Subject: Re: [Lifecycle-query-workgroup] TRS 2.0 Specification - Rollback Behavior

Arthur

Is it true that if a server performs a rollback then the desired state of the index is to reflect the rolled-back state of indexed resources? In terms of desired outcome I would prioritise as below:

1. Client detects a rollback (either through detecting change log inconsistencies or through an explicit trs:Rollback event) and processes the delta based on local history record
2. Client detects a rollback (either through detecting change log inconsistencies or through an explicit trs:Rollback event) and - due to absence of local history - halts and waits for admin intervention to select re-index or ignore
3. Client detects a rollback (either through detecting change log inconsistencies or through an explicit trs:Rollback event) and - due to absence of local history - proceeds with ignore
4. Client detects a rollback (either through detecting change log inconsistencies or through an explicit trs:Rollback event) and - due to absence of local history - proceeds with re-index

In all cases, a trs:Rollback event would seem a desirable addition; however, I'm not sure of the real value, as most server rollbacks would likely be at the entire server/OS level and so the server would not be aware it had been rolled back in order to issue the event.

With #1 being the optimal outcome, is there any guidance or recommendations regarding client implementations 'retaining a local record of previously processed events'?
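As a rough illustration of the prioritised outcomes above, the sketch below computes a delta from a local history when one is available and otherwise falls back to a configured policy. Everything in it is an assumption made for illustration: the `Policy` enum, the `rollback_delta` and `respond_to_rollback` functions, and the representation of both the local history and the server change log as oldest-first lists of event identifiers. The TRS spec does not define any of these.

```python
from enum import Enum
from typing import List, Optional, Tuple

Delta = Tuple[List[str], List[str]]  # (events to undo, events to apply)


class Policy(Enum):
    HALT = "halt"
    IGNORE = "ignore"
    REINDEX = "re-index"


def rollback_delta(local_history: List[str],
                   server_log: List[str]) -> Optional[Delta]:
    """Find the newest locally recorded event that the server still lists.

    Local events after that point were presumably rolled back and need
    undoing; server events after it are new and need applying.
    Returns None when the two sequences share no event.
    """
    server_pos = {event_id: i for i, event_id in enumerate(server_log)}
    for i in range(len(local_history) - 1, -1, -1):
        event_id = local_history[i]
        if event_id in server_pos:
            to_undo = local_history[i + 1:]
            to_apply = server_log[server_pos[event_id] + 1:]
            return to_undo, to_apply
    return None


def respond_to_rollback(local_history: List[str], server_log: List[str],
                        policy: Policy) -> Tuple[str, Optional[Delta]]:
    """Prefer the delta (outcome 1); otherwise use the configured policy."""
    delta = rollback_delta(local_history, server_log)
    if delta is not None:
        return "delta", delta
    return policy.value, None
```

For example, with a local history of e1..e4 and a server log of e1, e2, e5, the sketch would undo e3 and e4 and then apply e5. Without a usable history it simply reports the configured policy, leaving the actual halt, ignore, or re-index handling to the caller.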
Regards,
Ben Williams
Senior Product Manager
IBM Rational Systems Engineering
Phone: 44-1344 443020
E-mail: [email protected]
5 Guillemot Street
Bracknell, Berkshire RG12 8ER
United Kingdom
IBM United Kingdom Limited Registered in England and Wales with number 741598 Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU

From: Arthur Ryman <[email protected]>
To: [email protected]
Cc: [email protected]
Date: 12/06/2013 19:13
Subject: [Lifecycle-query-workgroup] TRS 2.0 Specification - Rollback Behavior
Sent by: [email protected]

The TRS spec mentions server rollbacks in several places, but never defines what these are. A definition should be added. There is actually no concrete representation for a rollback event. Instead, a server rollback is inferred when the client detects certain conditions. The spec [1] has the following text:

"In the (hopefully rare) situation that the Client fails to find its sync point event, one of two things is likely to have happened on the Server: either the Server has truncated its Change Log, or the Server has been rolled back to an earlier state. If the Client had been retaining a local record of previously processed events, the Client may be able to detect a Server rollback if it notices the successor event of some previously processed event has been removed or changed to one with a different identifier than before."

My dev team is working with a client implementation of the TRS spec (LQE) that interprets certain conditions in the TRS feed as indicating a rollback event, and then re-indexes the entire data source. This behavior is undesirable since indexing a large data source can take days, during which time users can't get accurate query results.

I recommend that we expand the guidance for how TRS clients should respond to an inferred rollback event. There should be other less disruptive courses of action. In some cases the rollback event is caused by other factors.
We have observed that the spec is difficult to implement unless the server maintains certain information, e.g. a record of each change. In our experience, we have never actually rolled back our server, but due to race conditions we occasionally produce a change log that appears to contain a rollback event.

The alternate responses to a rollback include:

1. ignore - the client continues to process the change log and makes a sensible guess about where to cut off, e.g. by remembering some information from the previous change log
2. halt - the client stops processing and waits for an administrator to explicitly select the next action, which could be ignore or re-index

The client should be configured with a suitable policy, e.g. ignore, halt, or re-index, and have an admin interface so that a human administrator can take the best course of action. In any case, a unilateral automatic decision to re-index is problematic.

Another way to deal with rollback events is to add a new type of event to the change log, i.e. a trs:Rollback event. Only when this event is received should a client re-index.

Minor point: the text of the specification should not use both the terms "cutoff event" and "sync point". Let's pick one and use it throughout.

Regards,
___________________________________________________________________________
Arthur Ryman
DE, Chief Architect, Reporting & Portfolio and Strategy Management
IBM Software, Rational
Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile)

_______________________________________________
Lifecycle-query-workgroup mailing list
[email protected]
http://mailman.hursley.ibm.com/mailman/listinfo/lifecycle-query-workgroup

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
_______________________________________________
Oslc-Core mailing list
[email protected]
http://open-services.net/mailman/listinfo/oslc-core_open-services.net
