[ 
https://issues.apache.org/jira/browse/FALCON-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15690243#comment-15690243
 ] 

Ajay Yadava commented on FALCON-1406:
-------------------------------------

{quote} a lot of features developed in falcon took the immutability of *entity 
definition* for past instances as given {quote}

Rerunning old instances doesn't affect that. New entities create new instances 
and don't affect "old instances", though they might overwrite the result of 
other entities.


{quote}This is a much cleaner way of retaining history than the current scheme 
{quote}
I understand and agree with the motivation, but IMHO this approach pollutes 
history. The overlapping instances ran and had a status; selectively nuking 
them with this change seems clean but actually creates more problems than it 
solves. A better approach to retaining history would be to introduce entity 
versioning.

The workaround you have suggested now is one that doesn't leave a "mess" 
(defunct and similarly named entities), so it needs no cleanup. However, there 
are many other cases where such a "mess" will arise, and a tool can definitely 
be built to highlight entities older than a given time range so that they can 
be deleted. It would also be useful if someone chooses not to delete and 
recreate the entity but instead to update it and reprocess old instances with 
a backfill job. Both approaches have their own pros and cons, and neither is 
ideal.


> Effective time in Entity updates.
> ---------------------------------
>
>                 Key: FALCON-1406
>                 URL: https://issues.apache.org/jira/browse/FALCON-1406
>             Project: Falcon
>          Issue Type: New Feature
>            Reporter: sandeep samudrala
>            Assignee: sandeep samudrala
>         Attachments: FALCON-1406-initial.patch, 
> effective_time_in_entity_updates.pdf
>
>
> Effective time for entity updates needs to be supported for past times as 
> well. The effective time capability provided earlier only allows setting an 
> effective time that is the current or a future time (now + delay), which does 
> not solve all the issues. 
> Following are a few scenarios that require an effective time in the past.
> a) New code being deployed for an incompatible input data set, which would 
> leave instances with old code and new data.
> b) Bad code being pushed, for which the entity should be able to go back in 
> time to replay (rerun) with new code.
> c) Orchestration-level changes (good/bad) would need functionality to go back 
> in time to start with.
> For reference: linking all the Jiras that have been worked on around 
> effective time.
> https://issues.apache.org/jira/browse/FALCON-374
> https://issues.apache.org/jira/browse/FALCON-297



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
