[ 
https://issues.apache.org/jira/browse/ACCUMULO-3598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359594#comment-14359594
 ] 

Josh Elser commented on ACCUMULO-3598:
--------------------------------------

I was thinking about how to address this. The approach I came up with is to 
create two new znodes: one to track the RFile version and another to track the 
WAL version. The master would be the coordinator that updates the values of 
these nodes, and tservers would watch them. When a tserver sees the change, 
it can flip over to writing the newer version. The asynchronous nature of the 
propagation via ZK is unimportant because the upgraded servers should also be 
able to read the old files.
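
A minimal sketch of the watch-and-flip idea, under stated assumptions: the 
znode is modeled by an in-memory stand-in rather than a real ZooKeeper client, 
and the names VersionNode and TabletServerSketch are hypothetical, not existing 
Accumulo classes. Capping at the server's own maximum supported version is an 
extra safeguard assumed here, not something proposed above.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

/** In-memory stand-in for a znode that holds one file-format version (hypothetical). */
class VersionNode {
    private final AtomicInteger version;
    private final List<IntConsumer> watchers = new CopyOnWriteArrayList<>();

    VersionNode(int initial) {
        version = new AtomicInteger(initial);
    }

    int get() {
        return version.get();
    }

    /** Registers a watcher. Unlike a real ZooKeeper watch, it is not one-shot. */
    void watch(IntConsumer watcher) {
        watchers.add(watcher);
    }

    /** Intended for the master only: publishes a new version and notifies watchers. */
    void set(int newVersion) {
        version.set(newVersion);
        for (IntConsumer watcher : watchers) {
            watcher.accept(newVersion);
        }
    }
}

/** Sketch of a tserver that tracks which version it should write (hypothetical). */
class TabletServerSketch {
    private volatile int writeVersion;

    TabletServerSketch(VersionNode node, int maxSupported) {
        // Register the watch first, then read the current value, so an update
        // arriving between the two steps is not lost.
        node.watch(v -> writeVersion = Math.min(v, maxSupported));
        // Assumption: never write a version newer than this server understands.
        writeVersion = Math.min(node.get(), maxSupported);
    }

    int writeVersion() {
        return writeVersion;
    }
}
```

In this model there would be one VersionNode for RFiles and another for WALs. 
A tserver that has not yet seen the update keeps writing the old version, 
which is harmless because every server can still read it.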

This approach is, however, predicated on the assumption that we can construct a 
writer for a specific version (e.g. {{RFile.newWriter(8)}}). This would let us 
provide a simple API through the master to coordinate the switch, akin to 
{{hdfs dfsadmin -finalizeUpgrade}}. After the rolling restart/upgrade of all of 
the nodes is complete, the servers can be switched over to using the new file 
formats with a single API call.
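
A rough sketch of what a version-specific writer factory and a finalize call 
might look like. Everything here is hypothetical: Writers.newWriter stands in 
for the assumed {{RFile.newWriter(8)}}, the "v7|"/"v8|" prefixes are made-up 
placeholders for real file formats, and UpgradeCoordinator is an illustrative 
stand-in for the master, not an existing API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collection;

/** Hypothetical writer bound to one on-disk format version. */
interface VersionedWriter {
    int version();

    byte[] encode(String value);
}

final class Writers {
    private Writers() {}

    /** Hypothetical analogue of RFile.newWriter(version); the formats are placeholders. */
    static VersionedWriter newWriter(int version) {
        switch (version) {
            case 7:
                return new PrefixWriter(7, "v7|");
            case 8:
                return new PrefixWriter(8, "v8|");
            default:
                throw new IllegalArgumentException("unsupported version: " + version);
        }
    }

    /** Toy writer that tags each value with a version prefix. */
    private static final class PrefixWriter implements VersionedWriter {
        private final int version;
        private final String prefix;

        PrefixWriter(int version, String prefix) {
            this.version = version;
            this.prefix = prefix;
        }

        @Override
        public int version() {
            return version;
        }

        @Override
        public byte[] encode(String value) {
            return (prefix + value).getBytes(StandardCharsets.UTF_8);
        }
    }
}

/** Illustrative stand-in for the master's side of the finalize call. */
final class UpgradeCoordinator {
    private int writeVersion;

    UpgradeCoordinator(int initialVersion) {
        this.writeVersion = initialVersion;
    }

    int writeVersion() {
        return writeVersion;
    }

    /**
     * Raises the write version to the highest version that every reported
     * server supports. It never lowers the version, and an empty report
     * leaves it unchanged.
     */
    int finalizeUpgrade(Collection<Integer> maxSupportedByServer) {
        int common = maxSupportedByServer.stream()
                .mapToInt(Integer::intValue)
                .min()
                .orElse(writeVersion);
        if (common > writeVersion) {
            writeVersion = common;
        }
        return writeVersion;
    }
}
```

With one server still on the old software, finalizeUpgrade leaves the write 
version alone; once every server reports support for version 8, a single call 
flips it, and Writers.newWriter(master.writeVersion()) starts producing the 
new format.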

> Address varied file versions by servers over RPC layer
> ------------------------------------------------------
>
>                 Key: ACCUMULO-3598
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3598
>             Project: Accumulo
>          Issue Type: Sub-task
>          Components: tserver
>            Reporter: Josh Elser
>             Fix For: 1.7.0
>
>
> There's an issue of handling newer versions of RFile and WALs in the middle 
> of a rolling restart.
> 1. Server1 is restarted as the new version
> 2. Server1 writes some new data
> 3. Server1 dies
> 4. Server2 (still old version) gets the tablets from Server1
> We need to ensure that there is control to limit the new software from 
> writing out new versions of persistent files while there are still old 
> versions of the software participating in the instance. It's similar to 
> finalizing an upgrade: after we're sure that all of the servers have been 
> upgraded and are functioning well, we can flip them over to using new 
> messages/serialization that the old versions aren't aware of.
> This problem gets much easier after we adopt Thrift/PB for serializing things 
> because both of those can naturally read newer versions of messages they know 
> about, ignoring the new fields.
> Ideally, we should define an API which rolling restart (ACCUMULO-1454) can 
> leverage, but there are many ways we could go about the "feature".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
