echauchot commented on PR #21636:
URL: https://github.com/apache/flink/pull/21636#issuecomment-1383753431

   > Hey @echauchot Thanks for working on the issue.
   > 
   > Correct me if I am wrong, but in the current state you always apply the 
compression for the operator state.
   > 
   
   No, see `DefaultOperatorStateBackendBuilder` I call 
`AbstractStateBackend.getCompressionDecorator` that reads 
`execution.checkpointing.snapshot-compression` conf parameter.
   
   > First of all, imo, this needs to be configurable. I'd say it can be done 
with the same flag as for the keyed state backend 
(`execution.checkpointing.snapshot-compression`)
   
   See above, I've found we could reuse 
`execution.checkpointing.snapshot-compression` conf parameter. Tell me if you 
want it to be a different conf parameter. 
   
    
   > Moreover, right now it causes previously written checkpoints to be 
unreadable. (have you tried restoring from an old checkpoint? If not, could you 
add such a test?) as it always assumes the stream is compressed. 
   > The info if the state is compressed or not needs to be persisted 
somewhere. I don't have a good idea for that now. In the keyed state case it is 
done via `KeyedBackendSerializationProxy#read/write`.
   
   Well, if the user disables compression to read uncompressed snapshots, then 
it works.
   Though, I have not tested to restore old snapshots. I have just tested the 
state write/read roundtrip via existing tests. I'll add such a test
   
   Yes I saw that with key state a boolean is written first when the state is 
written. I can do similar thing for operator state.
   
   > 
   > Now I realise this might require changes to the checkpoint format, unless 
we figure out a smart place to put it, so we might need to create a small FLIP 
for that.
   
   Ok, adding the boolean will require to change the snapshot format indeed. 
This is why I have not added this boolean to avoid the format change. But, 
indeed, if a snapshot was compressed and the user then disables compression he 
could not read the previous compressed snapshot with my code. What about simply 
adding a compression boolean  to operator state meta ? Need a FLIP for this 
change ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to