[ 
https://issues.apache.org/jira/browse/CASSANDRA-17044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17433642#comment-17433642
 ] 

Alex Petrov commented on CASSANDRA-17044:
-----------------------------------------

Thank you for the patch and clarifications. I like the overall direction 
towards modularisation and interfaces. My comments are following.
 * Even though I generally like that {{SchemaChangeListener}} is now an 
interface rather than abstract class, I’m concerned about potential users who 
might have relied on it being implemented the way it was. Given this was a 
useful interface, I think we should be more careful with it. I think it’s 
possible to bring backwards compatibility back by creating a wrapper class with 
the old interface. Would still require recompilation for the users, but could 
will make it painless to migrate.
 * Is there any reason we wouldn’t want to make the maps in 
{{TableMetadataRefCache}} immutable? I realise they can be sizeable, but in 
this case we could also hide them behind the interface, make default 
implementation immutable and let other implementations be more efficient if 
necessary. This way we can atomically swap/publish to all three maps.
 * I’m not sure about the name {{SharedSchema}} class and nomenclature:
 ** Class itself, if the whole point was to add a version to Keyspaces, maybe 
we should just add the version to Keyspaces, and make sure keyspaces are being 
updated synchronously with the version. Moreover, I think having version here 
and version left in {{SchemaManager}} is making it more confusing. I realise 
that you’ve tried to hide the version for {{SharedSchema}} in the Default 
handler, it’s unclear whose responsibility the version actually is.
 ** Nomenclature: I don’t think this schema is “shared” in any way: all 
keyspaces that aren’t local are just nonlocal, and a subset of them is just 
“userDefined”. Maybe we could use “replicated” or “distributed” instead of 
“nonlocal”.
 * Is there anything that justifies the existing of {{LocalKeyspaces}} class 
and prevents it from being just an instance of {{Keyspaces}} which is more 
functional but seems to be at least equivalent for its current usages?
 * {{reloadSchemaAndAnnounceVersion}} seems to pick all keyspaces and then 
filter out all system/local ones. Maybe we can just use “shared” (in 
nomenclature of your patch)
 * in {{SchemaDiagnostics}}, “schemata” doesn’t seem to be a typo, but rather a 
plural of “schema”. There are a few other mentions of schemata in this file, so 
I’d check it again.
 * Should we make a separate implementation of 
{{IEndpointStateChangeSubscriber}} within {{SchemaUpdateHandler}}, since
 * There seems to be an unintentional use of DSE 
[here|[https://github.com/apache/cassandra/pull/1270/commits/42a825a98fb4d2564508463911f43fd2754051d6#diff-fcad6f463c8c500997d27cceced47b8209088e21df328f8bc8f75507bc92b3eaR97]].
 I do not mind the presence of this method though even though it seems to be 
unused.
 * Nit: [CASSANDRA-17044: Refactor schema management by jacek-lewandowski · 
Pull Request #1270 · apache/cassandra · GitHub|#1270 · apache/cassandra · 
GitHub]([https://github.com/apache/cassandra/pull/1270/commits/42a825a98fb4d2564508463911f43fd2754051d6#diff-67531e8854562a763e98074cadaeee6d7c7c288a30ebef9fa465e249acab9fb2R115])
 here probably you have meant “local keyspace definitions” or something similar.
 * Unfortunately, I could not find a full set of passing tests accompanying the 
patch. Could you, when posting the updates, make sure to include a link to 
CircleCI?

Lastly, I feel like for a 4k lines of code patch, this one has surprisingly few 
tests. I realise that schema was always a generally undertested area of code, 
but I think our current quality standards do not allow us to do massive 
refactoring without substantial testing. Therefore, I suggest we create fuzz 
test that verifies schema eventual propagation, reads/writes during schema 
changes, and cross-node consistency of schema operations themselves, including 
concurrent ones. What seems to be particularly important to test is the gossip 
propagation of the version ID, which would lead to schema pulls, alongside with 
active schema pushes that is done when there’s a non-local schema change, and 
picking up schema during bootstrap.

Also, two comments about the scope of the patch, since the patch seems to have 
included several changes that seem to be unrelated to refactoring, and are 
rather bugfixes or performance improvements: 
 * 631fa2939708dd3cbd90f26442a7e167a3386cd7 is touching things that are used in 
almost every subsystem in the code. {{Keyspace.open()}} is a super common call. 
I think changes to such method deserves extensive testing, especially if 
previous code was inefficient but not problematic. If this is a performance 
improvement, I suggest to extract it and put it in a different patch, add tests 
and demonstrate the improvement. If it’s a bug - I'd report it as such as well
 * 424cf4ae9544b33a617b6a1a95aee6aad47a98ec I think also deserves a separate 
ticket: it looks like a bug fix, even potentially a pretty serious one. Sorry 
to burden you with this, but I think it’d be great to add a test for it as 
well. 
 * Same goes for beed60a1c0eb6cb5a8612053fd7ea756b52c5ceb.

Hope you find it helpful, looking forward to see changes in the patch and 
potentially landing it.

> Refactor schema management to allow for schema source pluggability
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-17044
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17044
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Cluster/Schema
>            Reporter: Jacek Lewandowski
>            Assignee: Jacek Lewandowski
>            Priority: Normal
>
> The idea is decompose `Schema` into separate entities responsible for 
> different things. In particular extract what is related to schema storage and 
> synchronization into a separate class so that it is possible to create an 
> extension point there and store schema in a different way than 
> `system_schema` keyspace, for example in etcd. 
> This would also simplify the logic and reduce the number of special cases, 
> make all the things more testable and the logic of internal classes 
> encapsulated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to