[
https://issues.apache.org/jira/browse/CASSANDRA-9761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736619#comment-14736619
]
Sam Tunnicliffe commented on CASSANDRA-9761:
--------------------------------------------
I kind of like the idea of rejecting any role operations until setup is
completed, as opposed to just advising against it as we did previously.
Unfortunately, this breaks the migration path, as the conversion from users to
roles uses the public {{createRole}} method and therefore also goes through the
new {{process}}.
Perhaps we could just change the semantics slightly from {{isSetup}} to
{{isClusterReady}} and set that to true as soon as the {{areAllNodesAtLeast22}}
check returns true, before the setup task is executed.
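
To illustrate the ordering suggested above, here is a minimal sketch. It is not the actual {{CassandraRoleManager}} code: the class {{RoleManagerSketch}}, the {{trySetup}} method, and the injected {{BooleanSupplier}} (standing in for the {{areAllNodesAtLeast22}} check) are all hypothetical names chosen for the example. The point is only that the readiness flag is set before the legacy conversion runs, so the conversion can still call the guarded public {{createRole}}.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the role manager; names are illustrative only.
final class RoleManagerSketch {
    // Stands in for the areAllNodesAtLeast22 check (assumption: injected for the example).
    private final BooleanSupplier allNodesAtLeast22;
    private volatile boolean isClusterReady = false;
    private int rolesCreated = 0;

    RoleManagerSketch(BooleanSupplier allNodesAtLeast22) {
        this.allNodesAtLeast22 = allNodesAtLeast22;
    }

    // Public entry point: rejected until the cluster is known to be ready.
    void createRole(String name) {
        if (!isClusterReady) {
            throw new IllegalStateException("Cluster not ready, rejecting createRole(" + name + ")");
        }
        rolesCreated++;
    }

    // Returns false when the cluster is not ready; the caller would reschedule.
    boolean trySetup() {
        if (!allNodesAtLeast22.getAsBoolean()) {
            return false;
        }
        // Flip the flag BEFORE running setup so the conversion below is not rejected.
        isClusterReady = true;
        convertLegacyData();
        return true;
    }

    // Stand-in for the users-to-roles conversion, which uses the public createRole.
    private void convertLegacyData() {
        createRole("legacy_user");
    }

    int rolesCreated() {
        return rolesCreated;
    }
}
```

With a flag named {{isSetup}} that only flips after the task finishes, the {{convertLegacyData}} call in this sketch would be rejected by its own guard, which is the migration problem described above.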

This isn't important, just a potential improvement, but seeing as we're now
rescheduling the setup task when the cluster is not ready, perhaps we could do
the same in other failure scenarios, e.g. if {{convertLegacyData}} or
{{setupDefaultRole}} runs but fails, only a restart will trigger a re-attempt.
Of course, the primary reason for either of those to fail previously was being
run on a mixed cluster, so it's far less likely to happen. Still, it seems like
some low-hanging fruit.
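
A rough sketch of what rescheduling on failure could look like, using only {{java.util.concurrent}}. This is not taken from the linked commits: {{RetryingSetupTask}}, its fields, and the fixed retry delay are assumptions made for the example, and the {{setup}} runnable stands in for whatever wraps {{convertLegacyData}} and {{setupDefaultRole}}.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper: re-submits itself whenever the wrapped setup step throws.
final class RetryingSetupTask implements Runnable {
    private final ScheduledExecutorService executor;
    private final Runnable setup;            // stand-in for the conversion / default-role setup
    private final long retryDelayMillis;     // fixed delay, chosen for simplicity
    private final CountDownLatch completed = new CountDownLatch(1);
    private final AtomicInteger failures = new AtomicInteger();

    RetryingSetupTask(ScheduledExecutorService executor, Runnable setup, long retryDelayMillis) {
        this.executor = executor;
        this.setup = setup;
        this.retryDelayMillis = retryDelayMillis;
    }

    @Override
    public void run() {
        try {
            setup.run();
            completed.countDown();
        } catch (RuntimeException e) {
            // Setup failed: count it and try again later instead of waiting for a restart.
            failures.incrementAndGet();
            executor.schedule(this, retryDelayMillis, TimeUnit.MILLISECONDS);
        }
    }

    int failures() {
        return failures.get();
    }

    // Blocks until setup has succeeded or the timeout elapses; returns true on success.
    boolean awaitCompletion(long timeoutMillis) {
        try {
            return completed.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Since the setup and conversion operations are described as idempotent, re-running them after a partial failure should be safe. A real version would probably want a cap on attempts or a backoff rather than retrying indefinitely at a fixed interval.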

I've pushed a couple of commits for the above
[here|https://github.com/beobal/cassandra/tree/9761].
> Delay auth setup until peers are upgraded
> -----------------------------------------
>
> Key: CASSANDRA-9761
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9761
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sam Tunnicliffe
> Assignee: Sylvain Lebresne
> Fix For: 3.0.0 rc1, 2.2.2
>
>
> The built-in auth classes {{CassandraRoleManager}} and
> {{CassandraAuthorizer}} both attempt to do some setup and data conversion
> when a node is upgraded to version 2.2 or higher. At the moment, each node
> attempts the operations with the expectation that this will fail until enough
> of the cluster has been upgraded for it to succeed (i.e. enough nodes have
> the latest schema with the requisite new tables). These expected failures are
> largely harmless, but they are annoying because they cause the receiving node
> (the non-upgraded node) to close the connection with the upgraded node, which
> then has to be re-established. Although this is the normal behaviour on schema
> disagreement (see CASSANDRA-9136 for further discussion), it may be possible
> to avoid in this specific circumstance. Given that we expect the operations
> to fail until enough nodes are upgraded, we could defer them until we're sure
> they can succeed by checking the messaging service version of peers.
> Right now these are a one-shot thing: each node makes only one attempt at the
> conversion (until it is restarted). Without investigating further, I don't
> know if we'd need to add in retries in case it takes a little time for each
> peer's MS version to be updated as they're upgraded. The setup & conversion
> operations are idempotent, so there shouldn't be a great issue if several
> nodes attempt them at the same time anyway.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)