ChenSammi commented on PR #3908: URL: https://github.com/apache/ozone/pull/3908#issuecomment-1345817295
@symious, thanks for the review.

> 1. The interface we are using is `check()`, but `compaction` seems more than a check operation since it's updating the rocksdb.
>
> 2. The performance issue might not be annoying, Users or Devs might not be easy to find their cluster slow caused by this operation.
>
> I was wondering if we don't do the compaction operation in `check()`, but only add a `need to compact` tag. Meanwhile, we can have a compaction thread doing the real compaction when Datanode is not busy so that the Datanode won't be affected by the compaction.

RocksDB already implements something close to what you suggest. `rocksdb.compactRange` returns immediately; the real compaction happens in a separate background thread on RocksDB's own schedule.

About the performance concern, I did some calculations on the RocksDB size for a single volume. Say the volume is 16TB. In the small object/file case, with 1MB per file, there will be 16 million files (one block per file). Currently, one block with one chunk is 63 bytes (protobuf serialized), so all the block data comes to about 1GB in total. Allowing for the other data in RocksDB, the total RocksDB size per 16TB volume should be no more than several GB, or less than 1GB if the files are big. So I'm not very worried about RocksDB performance on the datanode.
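The size estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the class and method names are hypothetical and not part of Ozone, and decimal units (1TB = 10^12 bytes, 1MB = 10^6 bytes) are assumed. The 63 bytes per block is the figure quoted in the comment.

```java
// Hypothetical helper, not part of Ozone: back-of-the-envelope estimate of
// the block metadata stored in RocksDB for one datanode volume.
public class BlockMetadataEstimate {

    // Number of blocks, assuming one block per file and equally sized files.
    static long blockCount(long volumeBytes, long fileBytes) {
        return volumeBytes / fileBytes;
    }

    // Total serialized block metadata in bytes.
    static long metadataBytes(long volumeBytes, long fileBytes, long bytesPerBlock) {
        return blockCount(volumeBytes, fileBytes) * bytesPerBlock;
    }

    public static void main(String[] args) {
        long volumeBytes = 16_000_000_000_000L; // 16TB volume (decimal units assumed)
        long fileBytes = 1_000_000L;            // 1MB per file
        long bytesPerBlock = 63L;               // one block with one chunk, protobuf serialized

        long blocks = blockCount(volumeBytes, fileBytes);
        long bytes = metadataBytes(volumeBytes, fileBytes, bytesPerBlock);

        // Prints: blocks=16000000 bytes=1008000000
        System.out.println("blocks=" + blocks + " bytes=" + bytes);
    }
}
```

With these inputs the block data comes to 1,008,000,000 bytes, which matches the "about 1GB" figure. Larger files reduce the block count and therefore the metadata size proportionally.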
