[ 
https://issues.apache.org/jira/browse/FALCON-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15691996#comment-15691996
 ] 

Srikanth Sundarrajan commented on FALCON-1406:
----------------------------------------------

Thanks [~ajayyadava]. I wanted to bring the notes from the discussion back into 
the jira so that others can also chime in.

+*1. Motivation for the feature*+
There is a need for entity updates that take effect in the past, for example 
when code issues or data schema changes have to be applied retroactively. There 
is also a need for a clean way to do this without polluting the system with 
temporary entities or other hacks. There was general agreement on the utility 
of the feature.

+*2. Versioning of Entities*+
Would versioning of entities be a better way to handle this generically and 
more cleanly? The points discussed around this were:
  * Would it make sense to track and maintain versions for feeds, and what 
challenges would consumers of the data/feed face if the feed were versioned?
  * If entities are versioned, would all the APIs, and hence the end users, 
have to be version aware in all operations?
  * Would versioning solve this problem more cleanly, and if so, how?

This is what we felt would be good answers to these questions.
  * Versioning of feeds would indeed make things difficult for the consumers. 
Having processes depend on the latest definition of the feed at the time of 
their execution seemed the right approach (lifecycle action execution would 
still benefit from versioning, more on that later).
  * Processes, on the other hand, would benefit from versioning as there is 
code associated with them. There are a number of ways to look at the versioning 
scheme. If time (loosely, effective time) were treated as the equivalent of a 
version, then the current feature does allow for a rudimentary versioning 
scheme. However, the rest of the system, particularly the config store, has to 
be version aware regardless of the versioning scheme. If any of the sub 
services within the Falcon system use the definition of the entity to build out 
further capabilities, then those have to be version aware as well (e.g. SLA 
monitoring, alerting and the like). While the system itself has the ability to 
track the version / history of the entity, it didn't seem right to burden the 
users (or the APIs) with being version aware. It would be helpful to retain the 
current semantics. However, the definition listing, feed instance availability 
and dependency APIs would benefit from being version aware.
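
To make the "effective time as a version" idea concrete, here is a minimal 
sketch in Python. It is not Falcon code: the class name, method names and 
definition strings are hypothetical, and it simply assumes that each update is 
stored with its effective time and that an instance resolves to the most recent 
definition whose effective time is not after the instance time.

```python
from bisect import bisect_right
from datetime import datetime


class EffectiveTimeHistory:
    """Hypothetical history of one entity, keyed by effective time."""

    def __init__(self):
        self._times = []        # effective times, kept sorted
        self._definitions = []  # definitions, parallel to _times

    def update(self, effective_time, definition):
        """Record a definition that applies from effective_time onwards.

        Assumption: an update with the same effective time replaces the
        earlier one; other entries are left untouched.
        """
        idx = bisect_right(self._times, effective_time)
        if idx > 0 and self._times[idx - 1] == effective_time:
            self._definitions[idx - 1] = definition
        else:
            self._times.insert(idx, effective_time)
            self._definitions.insert(idx, definition)

    def definition_at(self, instance_time):
        """Return the definition in force at instance_time, or None."""
        idx = bisect_right(self._times, instance_time)
        return self._definitions[idx - 1] if idx > 0 else None


history = EffectiveTimeHistory()
history.update(datetime(2016, 1, 1), "process-v1")
# An update submitted later, but made effective from a time in the past.
history.update(datetime(2016, 3, 1), "process-v2")
```

With this layout the caller only ever asks for the definition at a given 
instance time and never sees a version number, which matches the point above 
about keeping users and APIs unaware of versions.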

+*3. Known and unknown gaps*+
  * There are many sub services, particularly on the instance start/finish 
path, that may break if this change is not handled correctly.
  * Scheduled feeds can have problems similar to those of processes, since an 
update to them can also be made retroactively.

+*4. Way forward*+
  * Write a design document that identifies the gaps in other components 
affected by effective time and, in particular, what changes would be entailed 
if entities are treated as versioned.
  * Identify and file the associated JIRAs for these gaps and address them.
  * As a community, we can then review and ensure that all known gaps are 
covered in the design document and that the issues are tracked.

Request [~ajayyadava] to chime in with any missing details, or if any details 
are misrepresented.

> Effective time in Entity updates.
> ---------------------------------
>
>                 Key: FALCON-1406
>                 URL: https://issues.apache.org/jira/browse/FALCON-1406
>             Project: Falcon
>          Issue Type: New Feature
>            Reporter: sandeep samudrala
>            Assignee: sandeep samudrala
>         Attachments: FALCON-1406-initial.patch, 
> effective_time_in_entity_updates.pdf
>
>
> Effective time for entity updates needs to be supported for past times as 
> well. An effective time capability was provided in the past, but it only 
> allows setting an effective time that is the current or a future time 
> (now + delay), which could not solve all the issues. 
> Following are a few scenarios which would require effective time to be 
> available with a time in the past.
> a) New code being deployed for an incompatible input data set, which would 
> leave instances with old code and new data.
> b) Bad code being pushed, after which the entity should be able to go back in 
> time to replay (rerun) with new code.
> c) Orchestration level changes (good/bad) would need functionality to go back 
> in time to start with.
> For reference: linking all the Jiras that have been worked upon around 
> effective time.
> https://issues.apache.org/jira/browse/FALCON-374
> https://issues.apache.org/jira/browse/FALCON-297



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
