mynameborat commented on pull request #1475:
URL: https://github.com/apache/samza/pull/1475#issuecomment-799476540


   I am not sure if the approach taken in the PR is the ideal way to go about 
achieving the changes you need. I see few problems with delegating `null` 
checks from `BaseKeyValueStorageEngineFactory` to the extenders. 
   
   1. In `BaseKeyValueStorageEngineFactory`, we do multiple layers of wrapping 
stores (logged store, access logged store, serialized store) and so on.. By 
removing the checks, you are breaking this invariant of non-null `Serde` to all 
of steps that follow construction of the higher layer store (RocksDb or 
InMemory etc).
   2. Particularly, this is problematic for the construction of 
`SerializedKeyValueStore` and `AccessLoggedStore` because the former assumes 
both `KeySerde` and `ValueSerde` to be null and the latter assumes `KeySerde` 
to be non-null.
   3. Forces high layer `StorageEngineFactory` to handle `KeySerde` and 
`ValueSerde` during construction even though it is not necessary. For e.g. 
`RocksDbKeyValueStorageEngineFactory` and `InMemoryKeyValueStorageEngineFactory`
   4. Introduces potential problems with `Serde`; i.e. Higher layer storage 
engines mistakenly use the `Serde` themselves and also have wrapped store 
handles which potentially does `Serde` as well. 
   
   It will be good if you can take into consideration the following as part of 
your approach
   
   - Do storage engines mandate over what kind of wrapping is allowed? e.g. 
logged store and/or serialized store etc
   - Do end users continue to describe properties of store that indirectly 
determines the wrapping applied? e.g. large message enabled store (vs cached 
store vs access logged store
   - Does the framework determine the combination of configurations that drive 
the wrapping strategy?
   
   Potential solutions I can think of
   1. Introduce internal flag or have explicit store properties that drives how 
wrapping is accomplished and determines if a store is `SerializedKeyValueStore` 
or not
   2. Instead of removing the null check, push it down within 
`BaseKeyValueStorageEngineFactory` and modify how wrapping is accomplished 
either based on storage engines.
   3. If flexibility is the key, introduce capability for storage engines to 
implement that handles raw store without any these goodies and wrapping so it 
has all the control it needs.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to