[
https://issues.apache.org/jira/browse/ACCUMULO-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Josh Elser updated ACCUMULO-1719:
---------------------------------
Fix Version/s: 1.8.0 (was: 1.7.0)
> Convenient instanceName to instanceID mapping is unnecessary
> ------------------------------------------------------------
>
> Key: ACCUMULO-1719
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1719
> Project: Accumulo
> Issue Type: Improvement
> Components: client
> Reporter: Christopher Tubbs
> Fix For: 1.8.0
>
>
> The ZooKeeperInstance constructor typically takes two parameters: instanceName
> and a comma-separated list of ZooKeeper host[:port] entries (there are also
> other constructors that take a UUID and/or a timeout setting).
> Initialize generates a UUID and associates a user-provided instanceName with
> it, using the following mapping in ZooKeeper:
> /accumulo/instances/instanceName, which contains a UUID, which points to
> /accumulo/UUID
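
To make the two-step lookup described above concrete, here is a minimal sketch. It is not Accumulo code: the class `InstanceMappingSketch`, its methods, and the in-memory map standing in for the ZooKeeper tree are all hypothetical names chosen for illustration. Only the path layout comes from the description.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical in-memory stand-in for the ZooKeeper tree (assumption, not Accumulo code).
class InstanceMappingSketch {
    // Maps a node path to the data stored in that node.
    final Map<String, String> nodes = new HashMap<>();

    // Mimics what Initialize is described as doing: generate a UUID, store it
    // under the instanceName node, and create the instance root node.
    String initialize(String instanceName) {
        String id = UUID.randomUUID().toString();
        nodes.put("/accumulo/instances/" + instanceName, id);
        nodes.put("/accumulo/" + id, "");
        return id;
    }

    // Two-step lookup: read the UUID stored under the name, then derive the
    // instance root path from that UUID.
    String resolveRoot(String instanceName) {
        String id = nodes.get("/accumulo/instances/" + instanceName);
        if (id == null) {
            throw new IllegalArgumentException("No such instance: " + instanceName);
        }
        return "/accumulo/" + id;
    }
}
```

Calling `initialize("dev")` and then `resolveRoot("dev")` returns `/accumulo/` followed by the generated UUID, so every client that connects by name depends on the contents of the `/accumulo/instances/dev` node.
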
> Since the introduction of instance.secret, there are potential problems with
> this mapping.
> If /accumulo (and /accumulo/instances and /accumulo/instances/instanceName)
> is created by Initialize in a write-protected way (using instance.secret),
> then re-initializing with a new generated instanceID but the same
> instanceName will not work unless the new instance has the same instance
> secret. This is very limiting and can be a nightmare for system
> administrators and developers trying to re-initialize.
> If it is not created in a write-protected way, there's an even bigger
> problem, because anybody with access to ZooKeeper can overwrite the old
> mapping to point to a new instance (and we expect all clients to be able to
> access ZooKeeper). While the old data is still protected, any clients
> connecting with the instanceName will connect (and ingest to) the new
> instanceID that the instanceName currently maps to.
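
The two cases above (mapping node write-protected by the secret versus left open) can be sketched roughly as follows. This is a simplified model under stated assumptions: `ProtectedMappingSketch` and its methods are hypothetical, and a plain string comparison stands in for whatever ACL check ZooKeeper actually performs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of write-protected versus open nodes (not ZooKeeper's ACL API).
class ProtectedMappingSketch {
    private final Map<String, String> data = new HashMap<>();
    // Path -> secret required to write; an absent entry means the node is open.
    private final Map<String, String> writeSecret = new HashMap<>();

    // Returns true if the write was accepted, false if it was refused.
    boolean write(String path, String value, String secret, boolean protect) {
        String required = writeSecret.get(path);
        if (required != null && !required.equals(secret)) {
            return false; // node is write-protected by a different secret
        }
        data.put(path, value);
        if (protect) {
            writeSecret.put(path, secret);
        }
        return true;
    }

    String read(String path) {
        return data.get(path);
    }
}
```

With `protect` set, a re-initialization that uses a different secret is refused, which is the first problem. Without it, any writer can repoint the name at a new instanceID, and clients reading the mapping follow it silently, which is the second.
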
> The current implementation appears to be using the former... (the
> instanceName node itself is protected by the same secret as the instanceId
> and child nodes). This means that at least the mapping is protected from
> being overwritten... but it also means that it doesn't provide us with any
> added value. Even if we count the ability to reinitialize the same
> instanceName (generating a new instanceID) while leaving the old instance
> data around for inspection as added value, we still have two problems: ZK
> fills up, and because the mapping was rewritten, we can't tell which old
> instanceID was the previous one to inspect.
> A better solution:
> Drop the mapping. It is unnecessarily complex with no added value. Allow the
> instanceName that users create in new versions to represent the unique ID.
> Don't generate/use UUIDs anymore... use the provided instanceName. Keep the
> API for UUID... but just for convenience (treat it like a string internally).
> We can still prompt to overwrite the old instance... if it exists AND we have
> the same secret... but when we "overwrite it", we can optionally rename the
> old instanceName to instanceName_backup_date.
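
A rough sketch of the proposed scheme, assuming the instance lives directly at /accumulo/instanceName. `DirectInstanceSketch`, its method signature, and the ISO date suffix are assumptions made for illustration; the proposal only specifies the instanceName_backup_date naming idea and the same-secret condition. Only the root node of each instance is modeled here.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of the proposed layout: no name -> UUID mapping node.
class DirectInstanceSketch {
    // Maps an instance root path to the secret that created it.
    final Map<String, String> nodes = new TreeMap<>();

    // Initializes /accumulo/<instanceName>. Returns the backup path if an old
    // instance was kept under a backup name, otherwise null.
    String initialize(String instanceName, String secret, boolean keepBackup, LocalDate today) {
        if (instanceName.equals("instances")) {
            // Reserved so the name cannot collide with the legacy mapping node.
            throw new IllegalArgumentException("\"instances\" is not a valid instance name");
        }
        String path = "/accumulo/" + instanceName;
        String existingSecret = nodes.get(path);
        String backupPath = null;
        if (existingSecret != null) {
            if (!existingSecret.equals(secret)) {
                throw new SecurityException("Instance exists and was created with a different secret");
            }
            nodes.remove(path);
            if (keepBackup) {
                // Date format is an assumption; LocalDate prints as yyyy-MM-dd.
                backupPath = path + "_backup_" + today;
                nodes.put(backupPath, existingSecret);
            }
        }
        nodes.put(path, secret);
        return backupPath;
    }
}
```

Re-initializing "dev" with the same secret and `keepBackup` set leaves both /accumulo/dev and a dated backup node in place, so the previous instance can be found by name alone. With `keepBackup` unset, the old node is simply replaced.
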
> Dropping the mapping has the benefit of reduced complexity, and is (mostly)
> backwards-compatible (instances can't have the name "instances"). It is
> easier on developers to debug their instances, because there's no obscure
> UUID to deal with (unless they want to use that as the name) and they can
> find the old versions of their instances if they choose to back up the old
> data when re-initializing. If not, they can avoid ZK filling up (esp. in dev
> environments where instanceNames get reused often). And, with a backup naming
> convention, it's easy for admins to decide which old instance data to keep
> and which to throw away... without the need of a mapping. The scope for the
> instance.secret is also well-defined to just the /accumulo/instanceName that
> created it, and there's no possibility of overwriting the instanceName to
> instanceID mapping.
> Instance names work best when unique. Instance IDs are guaranteed to be
> unique. There's no good reason these should be separate things.