hanishakoneru opened a new pull request #3034:
URL: https://github.com/apache/ozone/pull/3034


   ## What changes were proposed in this pull request?
   
   Negative BlockCount and UsedBytes values have been observed in ContainerDB.
   
   The BlockCount is incremented only when the Stream is closed and not when 
the BlockID is added to the DB. If the OutputStream was not closed properly or 
if, for any reason, the client starts writing to a new pipeline before the full 
block is written, it could lead to a Block being present in the container but 
the key_count (BlockCount) not being incremented for it. When a block is 
deleted from a container, the blockCount is also decremented. But if the 
blockCount is wrong to start with, it could lead to a negative value.
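
   A minimal sketch of one way to keep the count in step with the block table, 
assuming the count is incremented when a block ID first lands in the DB rather 
than when the stream is closed. `BlockCountSketch` and its methods are 
hypothetical stand-ins for illustration, not the actual Ozone classes or the 
exact change made in this PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a container's block table and its
// blockCount metadata. Not the real Ozone container classes.
class BlockCountSketch {
    // localID -> block size in bytes (stands in for the block table in the DB)
    private final Map<Long, Long> blockTable = new HashMap<>();
    private long blockCount;

    // Count the block when its ID is first added to the table, so a stream
    // that is never closed cleanly still leaves the count consistent.
    void putBlock(long localId, long size) {
        boolean isNew = !blockTable.containsKey(localId);
        blockTable.put(localId, size);
        if (isNew) {
            blockCount++;
        }
    }

    // Decrement only when the block was actually present, so repeated
    // deletes of the same block cannot drive the count negative.
    void deleteBlock(long localId) {
        if (blockTable.remove(localId) != null) {
            blockCount--;
        }
    }

    long getBlockCount() {
        return blockCount;
    }

    int blocksInTable() {
        return blockTable.size();
    }
}
```

   With this shape, calling `putBlock` again for an existing ID (for example a 
later update of the same block) leaves the count unchanged, and the count 
always equals the number of entries in the table.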
   
   When a block is deleted, usedBytes is first decremented in memory after each 
chunk is deleted. The decrement happens even if the chunk file does not exist 
(already deleted), so usedBytes can be decremented multiple times for the same 
chunk, causing the total usedBytes metadata in the DB to become negative. The 
usedBytes value in containerDB is updated from the in-memory value only after 
all the chunks in all the blocks in that iteration of the BlockDeletingService 
task are deleted. This Jira proposes to first update the DB with the correct 
usedBytes (calculated from the BlockInfo after all chunks are deleted) and then 
update the in-memory metadata. This is the update sequence followed for all 
other state updates.
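
   The proposed ordering can be sketched roughly as follows. This is an 
illustration under stated assumptions, not the PR's code: `UsedBytesSketch`, 
the `"usedBytes"` key, and the plain `Map` standing in for the container DB are 
all hypothetical. The released bytes are derived from the sizes recorded in the 
deleted blocks' metadata, not from per-chunk file deletions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of "update the DB first, then the in-memory value".
class UsedBytesSketch {
    // Stand-in for the container DB metadata table (key name is assumed).
    final Map<String, Long> db = new HashMap<>();
    // Stand-in for the in-memory container metadata.
    long inMemoryUsedBytes;

    UsedBytesSketch(long initialUsedBytes) {
        db.put("usedBytes", initialUsedBytes);
        inMemoryUsedBytes = initialUsedBytes;
    }

    // blockSizes holds the size recorded in each deleted block's metadata.
    // Whether a chunk file was already missing on disk does not affect the
    // amount released, so the same bytes are not subtracted twice.
    void onBlocksDeleted(List<Long> blockSizes) {
        long released = 0L;
        for (long size : blockSizes) {
            released += size;
        }
        long updated = db.get("usedBytes") - released;
        db.put("usedBytes", updated);   // 1. persist the new value
        inMemoryUsedBytes = updated;    // 2. then mirror it in memory
    }
}
```

   Because the in-memory value is only ever assigned from what was just 
persisted, the two cannot drift apart within a single deletion pass.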
   
   Also, when a chunk is overwritten, it is assumed that the size of the chunk 
remains the same. But it is possible to overwrite a chunk with more data than 
was originally present. In this case, used_bytes should be updated with the 
difference in the chunk sizes. (Adding this as a TODO.)
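
   The arithmetic behind that TODO could look something like the sketch below. 
It is only an illustration: `ChunkOverwriteSketch` and `usedBytesAfterOverwrite` 
are hypothetical names, and the sketch assumes that an overwrite with less data 
does not shrink the chunk, so only growth is accounted for.

```java
// Hypothetical helper illustrating the TODO; not part of the PR.
class ChunkOverwriteSketch {
    // Returns the new usedBytes after a chunk of oldChunkLen bytes is
    // overwritten with newChunkLen bytes. Only growth is added (assumption:
    // a shorter overwrite leaves the chunk's on-disk size unchanged).
    static long usedBytesAfterOverwrite(long usedBytes,
                                        long oldChunkLen,
                                        long newChunkLen) {
        long growth = Math.max(0L, newChunkLen - oldChunkLen);
        return usedBytes + growth;
    }
}
```

   For example, overwriting a 4096-byte chunk with 6144 bytes would add 2048 
bytes to used_bytes, while overwriting it with the same or fewer bytes would 
leave used_bytes unchanged under this assumption.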
   
   This PR has the following fixes:
   
   1. Change Container DB KeyCount to BlockCount (to avoid confusion on what 
the keyCount denotes in ContainerData)
   2. Fix BlockCount update logic
   3. Fix UsedBytes decrement logic - update container DB first and then 
in-memory
   4. Move ContainerData metrics such as read/write counts and bytes into a 
separate class
   5. Cleanup and deprecate old Block and Chunk Delete methods
   6. Add a test to verify blockCount and usedBytes are updated correctly on 
write failures
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-5359
   
   ## How was this patch tested?
   
   Added a unit test
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


