dcapwell commented on code in PR #2267:
URL: https://github.com/apache/cassandra/pull/2267#discussion_r1162044295


##########
conf/cassandra.yaml:
##########
@@ -1924,8 +1924,14 @@ drop_compact_storage_enabled: false
 # which is used in some serialization to denote the format type in a compact 
way (such as local key cache); and 'name'
 # which will be used to recognize the format type - in particular that name 
will be used in sstable file names and in
 # stream headers so the name has to be the same for the same format across all 
the nodes in the cluster.
+# The first entry in this list is the format that will be used for 
newly-created SSTables. The other formats given
+# will be used to read any SSTables present in the data directories or 
streamed.
 sstable_formats:
   - class_name: org.apache.cassandra.io.sstable.format.big.BigFormat
     parameters:
       id: 0
       name: big
+  - class_name: org.apache.cassandra.io.sstable.format.bti.BtiFormat
+    parameters:
+      id: 1
+      name: bti

Review Comment:
   Agree with what @maedhroz said, and to extend on that.
   
   1) We figure out which format to use by using 
`org.apache.cassandra.io.sstable.format.SSTableFormat.Type#current`, which uses 
the first element of the list, which is not determinist as it is based off 
`Map` ordering (see 
`org.apache.cassandra.io.sstable.format.SSTableFormat.Type#readFactories`)
   1.1) we use a system property to override, but we should be using a config 
in cassandra.yml.  cassandra.yml is accessible via system properties if that is 
what a user wants, but also plugs in well with the rest of the system.
   1.2) if we do go the config route, we can make this a hot-prop using the 
existing config system
   2) streaming now requires globally consistent configs... this isn't 
realistic without transactional cluster metadata, so cases such as 
`org.apache.cassandra.db.streaming.CassandraStreamHeader.CassandraStreamHeaderSerializer#serialize`
 can have one instance writing "abc" and the other side failing as it doesn't 
know about "abc" (because it defined it as "big").
   
   I strongly feel we should redo this config layer and remove id/name from 
operator control as this isn't safe.  
   
   Now, if you want to add a custom format (I believe one of the goals of the 
pluggable work) then we need a "safe" way to discover, and I feel the only real 
safe way is discovery is via class existence (such as `ServiceLoader` from 
java).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to