[ 
https://issues.apache.org/jira/browse/OOZIE-346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101861#comment-13101861
 ] 

Hadoop QA commented on OOZIE-346:
---------------------------------

topher-zicornell remarked:
Hi Tucu,

For the blob thing, I'd like to point out that you're trading non-optimizable 
Java work (in the serialization and deserialization stuff) for optimizable DB 
work (in read/write of the additional data).  And the single row locking can 
still be used to gate access to the bag table (as long as it's a one-to-many 
relationship).

For the version thing, logic being different...  I've been pondering this.  
Just to be clear, we're talking ONLY about deserialization for this; not the 
state-machine itself, right?  If so, the only logic I can think of is type 
conversion.  (I mentioned that up above.)

If the fields are all simple strings or containers of strings, what other logic 
might there be?
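
To make the question concrete, here is a minimal sketch of what I mean. It is not Oozie code: the VersionedBag class, the version numbers, and the "retryCount" field are all made up for illustration. The state is a version-tagged bag of string key/value pairs, and the only version-dependent step on read is filling in a default for a field that older versions never wrote.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Oozie: a version-tagged bag of strings.
public class VersionedBag {
    // Assumed version number of the current on-disk layout.
    static final int CURRENT_VERSION = 2;

    // Writes the version tag, the entry count, then each key/value pair.
    static byte[] serialize(int version, Map<String, String> fields) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(version);
            out.writeInt(fields.size());
            for (Map.Entry<String, String> e : fields.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    // Reads any version: unknown keys are simply carried along, and keys
    // that an older version never wrote get a default value.
    static Map<String, String> deserialize(byte[] data) {
        Map<String, String> fields = new LinkedHashMap<>();
        int version;
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(data))) {
            version = in.readInt();
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                String value = in.readUTF();
                fields.put(key, value);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        // The only version-specific logic: "retryCount" is a made-up field
        // assumed to have been introduced in version 2.
        if (version < 2) {
            fields.putIfAbsent("retryCount", "0");
        }
        return fields;
    }
}
```

A bag written as version 1 with only a "status" entry reads back under the current code with "status" intact and "retryCount" defaulted to "0". Fields that a newer version has dropped would still come back in the map and could just be ignored by the caller.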

At any rate, I've shared all the bits I felt were important about this.  ;)

.  Topher

> GH-558: Serialization/deserialization of WorkflowInstance
> ---------------------------------------------------------
>
>                 Key: OOZIE-346
>                 URL: https://issues.apache.org/jira/browse/OOZIE-346
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Hadoop QA
>
> The Oozie team at Yahoo has recently experienced multiple production issues 
> after upgrading to a new Oozie version, attributed to modifications of the 
> workflow tables' structure.
> More specifically, we added a new field to the workflow table. For example, 
> if a user submits a WF job under an earlier Oozie version and the job is 
> still active after the upgrade, Oozie fails to deserialize the WFInstance 
> object. In other words, the object was originally serialized using the old 
> structure, whereas Oozie tries to deserialize it using the new structure 
> after the upgrade. It therefore throws an exception.
> Some observations that came up from our internal discussion:
> 1. Is it required to store the blob in the table? Can't we create the 
> object from the other fields of the table? I know it might not be that 
> straightforward. However, the other options might be worse than this.
> 2. If we want to keep the blob, new fields should be added at the end 
> during serialization. However, if some fields are removed, how would we 
> handle that? This might not be a flexible approach.
> 3. During serialization, we could write some type of version at the 
> beginning, which would help to deserialize the object. This might make the 
> code very ugly, depending on how many old versions we would like to support.
> 4. Since this is a very well-known problem, there should be some standard 
> procedure. However, that might not be easy either.
> Anyway, these are just our initial thoughts. We haven't reached any 
> conclusion yet.
> Please feel free to comment.
> Thanks,
> Mohammad

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
