comnetwork commented on a change in pull request #3947:
URL: https://github.com/apache/hbase/pull/3947#discussion_r775712662



##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChunkCreator.java
##########
@@ -134,51 +134,36 @@ public static ChunkCreator getInstance() {
     return instance;
   }
 
-  /**
-   * Creates and inits a chunk. The default implementation for a specific 
chunk size.
-   * @return the chunk that was initialized
-   */
-  Chunk getChunk(ChunkType chunkType) {
-    return getChunk(CompactingMemStore.IndexType.ARRAY_MAP, chunkType);

Review comment:
       @Apache9 ,I am sorry for not explain clearly.
   The first problem is yes,you are right. 
   The reply to second problem is also yes and the reason is as follows:
   `MemStoreLABImpl.getNewExternalChunk(ChunkType) `is called by following 
three methods:
   1. `MemStoreLABImpl.getNewExternalChunk(int)`,
     ```
    public Chunk getNewExternalChunk(int size) {
       int allocSize = size + ChunkCreator.SIZEOF_CHUNK_HEADER;
       if (allocSize <= ChunkCreator.getInstance().getChunkSize()) {
         return getNewExternalChunk(ChunkCreator.ChunkType.DATA_CHUNK);
       } else {
         Chunk c = this.chunkCreator.getJumboChunk(size);
         chunks.add(c.getId());
         return c;
       }
     }
   ```
   `MemStoreLABImpl.getNewExternalChunk(int)` itself is called by following 
`MemStoreLABImpl.forceCopyOfBigCellInto(Cell)`(we can ignore called by 
`ImmutableMemStoreLAB` because it is just a composite of `MemStoreLAB`),
   
   ```
   public Cell forceCopyOfBigCellInto(Cell cell) {
       int size = Segment.getCellLength(cell);
       Preconditions.checkArgument(size >= 0, "negative size");
       if (size + ChunkCreator.SIZEOF_CHUNK_HEADER <= dataChunkSize) {
         // Using copyCellInto for cells which are bigger than the original 
maxAlloc
         return copyCellInto(cell, dataChunkSize);
       } else {
         Chunk c = getNewExternalChunk(size);
         int allocOffset = c.alloc(size);
         return copyToChunkCell(cell, c.getData(), allocOffset, size);
       }
     }
   ```
   We can see `MemStoreLABImpl.forceCopyOfBigCellInto(Cell)` only calls 
`MemStoreLABImpl.getNewExternalChunk(int)` when 
   `size + ChunkCreator.SIZEOF_CHUNK_HEADER > dataChunkSize`, so when in above 
`MemStoreLABImpl.getNewExternalChunk(int)` because 
   `allocSize > ChunkCreator.getInstance().getChunkSize()`, the incorrect 
`MemStoreLABImpl.getNewExternalChunk(ChunkType) ` is in fact not be called, 
instead `ChunkCreator.getJumboChunk(size) ` which uses the correct 
`IndexType.CHUNK_MAP` is called, so in this code path, 
`MemStoreLABImpl.getNewExternalChunk(ChunkType) ` is actually not used, which 
is fortunate to not cause serious bug now.
   
   2.  `MemStoreLABImpl.getNewExternalChunk(ChunkType) ` is called by 
`CellChunkImmutableSegment.allocIndexChunks`  to allocate 
`IndexChunk`(`CellChunkImmutableSegment` itself is  ` IndexType.CHUNK_MAP`), 
because the `IndexType.ARRAY_MAP`,  it does not put the `IndexChunk` in 
`ChunkCreator.chunkIdMap` when the `IndexChunk` could not be pooled , but it 
does no matter because it is a `IndexChunk` not a `DataChunk`, and `IndexChunk` 
is put in `CellChunkMap.chunks`, which prevents it is GC-ed. If we put the 
IndexChunk in `ChunkCreator.chunkIdMap` when using the correct
   ` IndexType.CHUNK_MAP`, there not cause memory leak because when the 
`MemStoreLAB` is closed, `ChunkCreator.removeChunk` is invoked to remove the 
chunk from this `ChunkCreator.chunkIdMap`
   
   3. `MemStoreLABImpl.getNewExternalChunk(ChunkType) ` is called by 
`ImmutableMemStoreLAB.getNewExternalChunk(ChunkType)`, we could ignore it 
because it is just a composite of `MemStoreLAB`
   
   So In conclusion, we can say it is safe to change 
`MemStoreLABImpl.getNewExternalChunk(ChunkType) ` from  `IndexType.ARRAY_MAP` 
to `IndexType.CHUNK_MAP`, and from another perspective, because the IndexType 
of `MemStoreLABImpl` could only be` IndexType.CHUNK_MAP` ,and chunk could only 
be acquired from `ChunkCreator `through `MemStoreLABImpl`, we could also safely 
to  make `ChunkCreator` to allocate chunk only for ` IndexType.CHUNK_MAP` when 
using `CompactingMemStore`.
   
   For  `DefaultMemStore`, it is also reasonable to always put the chunk in 
`ChunkCreator.chunkIdMap` because(it is also the code behavior before this PR):
   1.When the `MemStoreLAB` which created the chunk is not closed, this chunk 
is used by the `Segment` which references this `MemStoreLAB`, so this chunk 
certainly should not be GC-ed, putting the chunk in `ChunkCreator.chunkIdMap` 
does not prevent useless chunk to be GC-ed. 
   2.When the `MemStoreLAB` which created the chunk is closed, and if the chunk 
is not pooled,  `ChunkCreator.removeChunk` is invoked to remove the chunk from 
this `ChunkCreator.chunkIdMap`,so there is no memory leak.
   
   The main difference between `IndexType.CHUNK_MAP` and `IndexType.ARRAY_MAP` 
is for `IndexType.CHUNK_MAP`, chunk is always put in  `ChunkCreator.chunkIdMap` 
even it is not  pooled.  So in my opinion, it is safe to remove the `IndexType` 
from `ChunkCreator` and always to put the chunk in `ChunkCreator.chunkIdMap` to 
make the code more simpler.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to