ChenSammi commented on PR #3908:
URL: https://github.com/apache/ozone/pull/3908#issuecomment-1345817295

   @symious, thanks for the review.  
   
   >     1. The interface we are using is `check()`, but `compaction` seems
   > more than a check operation since it's updating the rocksdb.
   >
   >     2. The performance issue might not be annoying, Users or Devs might
   > not be easy to find their cluster slow caused by this operation.
   >
   >
   > I was wondering if we don't do the compaction operation in `check()`, but
   > only add a `need to compact` tag. Meanwhile, we can have a compaction thread
   > doing the real compaction when Datanode is not busy so that the Datanode won't
   > be affected by the compaction.
   
   RocksDB already works roughly the way you suggest. `rocksdb.compactRange`
returns immediately, and the real compaction happens in a separate background
thread on RocksDB's own schedule.  
   
   On the performance concern, I did some calculations on the RocksDB size for
a single volume. Take a 16TB volume in a small-object scenario with 1MB files:
that is 16 million files (one block per file). Currently, a block with one
chunk takes 63 bytes (protobuf serialized), so all the block data adds up to
about 1GB. Allowing for the other data kept in RocksDB, the total RocksDB size
per 16TB volume should be no more than several GB, and less than 1GB if the
files are large. So I'm not very worried about RocksDB performance on the
datanode.  
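
   For reference, the estimate can be reproduced with a few lines of Python.
This is only a back-of-the-envelope sketch: the names are made up for
illustration, it uses decimal units, and it counts only the 63-byte serialized
block record, ignoring key size and any other RocksDB overhead.

```python
# Rough estimate of block metadata stored in RocksDB for one volume.
# Figures are the assumptions from the comment above (decimal units).
VOLUME_BYTES = 16 * 10**12   # 16 TB volume
FILE_BYTES = 10**6           # 1 MB per file, one block per file
BLOCK_RECORD_BYTES = 63      # protobuf-serialized block with one chunk


def block_metadata_bytes(volume_bytes, file_bytes, record_bytes=BLOCK_RECORD_BYTES):
    """Return (number of blocks, total bytes of block records)."""
    blocks = volume_bytes // file_bytes
    return blocks, blocks * record_bytes


blocks, total = block_metadata_bytes(VOLUME_BYTES, FILE_BYTES)
print(f"{blocks:,} blocks, {total / 10**9:.2f} GB of block data")
# prints: 16,000,000 blocks, 1.01 GB of block data
```

   Changing `FILE_BYTES` to a larger file size shows how quickly the block data
shrinks for big files.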


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

