[ 
https://issues.apache.org/jira/browse/CASSANDRA-17044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17438940#comment-17438940
 ] 

Jacek Lewandowski commented on CASSANDRA-17044:
-----------------------------------------------

I reverted {{SchemaManager}} back to {{Schema}} as it seems there is no 
consensus for that rename. As I said, it is not crucial for this ticket and I 
can be made any time later if needed.

bq. Even though I generally like that SchemaChangeListener is now an interface 
rather than abstract class, I’m concerned about potential users who might have 
relied on it being implemented the way it was. Given this was a useful 
interface, I think we should be more careful with it. I think it’s possible to 
bring backwards compatibility back by creating a wrapper class with the old 
interface. Would still require recompilation for the users, but could will make 
it painless to migrate.

According to the later discussion I assumed we can keep just an interface

bq. Is there any reason we wouldn’t want to make the maps in 
TableMetadataRefCache immutable? I realize they can be sizeable, but in this 
case we could also hide them behind the interface, make default implementation 
immutable and let other implementations be more efficient if necessary. This 
way we can atomically swap/publish to all three maps.

Ok, I've implemented it so that it is immutable. However, I have some concerns 
in that case because the {{TableMetadataRef}} objects are purposely mutable and 
thus we probably should keep the synchronization (although I removed it in the 
commit)

bq. I’m not sure about the name SharedSchema class and nomenclature:
bq. Class itself, if the whole point was to add a version to Keyspaces, maybe 
we should just add the version to Keyspaces, and make sure keyspaces are being 
updated synchronously with the version. Moreover, I think having version here 
and version left in SchemaManager is making it more confusing. I realise that 
you’ve tried to hide the version for SharedSchema in the Default handler, it’s 
unclear whose responsibility the version actually is.

As far as I understand, version not always makes sense with {{Keyspaces}} - for 
example, we only care about the version calculated for non-local keyspaces. So 
I'd prefer the approach with removing that class entirely and keep version 
separate. On the other hand, in a couple of places keyspaces and version are 
passed together to a method and it is quite convenient to keep them in a single 
immutable object.

bq. Nomenclature: I don’t think this schema is “shared” in any way: all 
keyspaces that aren’t local are just nonlocal, and a subset of them is just 
“userDefined”. Maybe we could use “replicated” or “distributed” instead of 
“nonlocal”.

I really had a problem with that. I like {{DistributedSchema}} name and I 
committed the renaming.

bq. Is there anything that justifies the existing of LocalKeyspaces class and 
prevents it from being just an instance of Keyspaces which is more functional 
but seems to be at least equivalent for its current usages?

I have no explanation, perhaps I was working on this for too long and another 
pair of eyes was needed to catch this. I committed removal of 
{{LocalKeyspaces}} in favor of using just {{Keyspaces}}.

bq. reloadSchemaAndAnnounceVersion seems to pick all keyspaces and then filter 
out all system/local ones. Maybe we can just use “shared” (in nomenclature of 
your patch)

It gets modified in later commits.

bq. in SchemaDiagnostics, “schemata” doesn’t seem to be a typo, but rather a 
plural of “schema”. There are a few other mentions of schemata in this file, so 
I’d check it again.

I checked - my understanding is that the schema is a set of all keyspaces 
metadata rather than metadata of a single keyspace. That's why we have 
{{Schema}} class to represent those keyspaces. We only deal with multiple 
schemata during the transformation, in particular we have the old schema and 
the new schema but this is unrelated to the fixed typo.

bq. Should we make a separate implementation of IEndpointStateChangeSubscriber 
within SchemaUpdateHandler, since

I'm not sure if I get what do you mean by that - do you think I should create a 
kinda inner class implementing that interface instead of implementing it on the 
handler level?

bq. There seems to be an unintentional use of DSE here. I do not mind the 
presence of this method though even though it seems to be unused.
bq. Nit: here probably you have meant “local keyspace definitions” or something 
similar.

Hopefully fixed.

bq. Unfortunately, I could not find a full set of passing tests accompanying 
the patch. Could you, when posting the updates, make sure to include a link to 
CircleCI?

Indeed, I haven't run CI yet because I expected (possibly significant) changes 
to be made as a result of the reviews. However, it is not an unproven solution 
- it was first implemented on a fork in DS and all the tests (excpect upgrade 
dtests and flaky) were passing

bq. Lastly, I feel like for a 4k lines of code patch, this one has surprisingly 
few tests. I realise that schema was always a generally undertested area of 
code, but I think our current quality standards do not allow us to do massive 
refactoring without substantial testing. Therefore, I suggest we create fuzz 
test that verifies schema eventual propagation, reads/writes during schema 
changes, and cross-node consistency of schema operations themselves, including 
concurrent ones. What seems to be particularly important to test is the gossip 
propagation of the version ID, which would lead to schema pulls, alongside with 
active schema pushes that is done when there’s a non-local schema change, and 
picking up schema during bootstrap.

This will take some time and seems to be a separate task, perhaps a dependency 
of this ticket. I mean, I'm not fixing schema synchronization or introducing 
any new functionality. If the implemented tests that you mentioned detect a 
problem on trunk, I don't know if this patch should address them (maybe it 
should, but it wouldn't be just a refactoring then)

Not very important in the context of the tests mentioned above, but the PR has 
over 80% coverage on the added/modified lines by running only JVM tests. 


> Refactor schema management to allow for schema source pluggability
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-17044
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17044
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Cluster/Schema
>            Reporter: Jacek Lewandowski
>            Assignee: Jacek Lewandowski
>            Priority: Normal
>             Fix For: 4.1
>
>
> The idea is decompose `Schema` into separate entities responsible for 
> different things. In particular extract what is related to schema storage and 
> synchronization into a separate class so that it is possible to create an 
> extension point there and store schema in a different way than 
> `system_schema` keyspace, for example in etcd. 
> This would also simplify the logic and reduce the number of special cases, 
> make all the things more testable and the logic of internal classes 
> encapsulated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to