comnetwork commented on a change in pull request #3947:
URL: https://github.com/apache/hbase/pull/3947#discussion_r775712662
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChunkCreator.java
##########
@@ -134,51 +134,36 @@ public static ChunkCreator getInstance() {
return instance;
}
- /**
- * Creates and inits a chunk. The default implementation for a specific
chunk size.
- * @return the chunk that was initialized
- */
- Chunk getChunk(ChunkType chunkType) {
- return getChunk(CompactingMemStore.IndexType.ARRAY_MAP, chunkType);
Review comment:
@Apache9 ,I am sorry for not explain clearly.
The first problem is yes,you are right.
The reply to second problem is also yes and the reason is as follows:
`MemStoreLABImpl.getNewExternalChunk(ChunkType) `is called by following
three methods:
1. `MemStoreLABImpl.getNewExternalChunk(int)`,
```
public Chunk getNewExternalChunk(int size) {
int allocSize = size + ChunkCreator.SIZEOF_CHUNK_HEADER;
if (allocSize <= ChunkCreator.getInstance().getChunkSize()) {
return getNewExternalChunk(ChunkCreator.ChunkType.DATA_CHUNK);
} else {
Chunk c = this.chunkCreator.getJumboChunk(size);
chunks.add(c.getId());
return c;
}
}
```
`MemStoreLABImpl.getNewExternalChunk(int)` itself is called by following
`MemStoreLABImpl.forceCopyOfBigCellInto(Cell)`(we can ignore called by
`ImmutableMemStoreLAB` because it is just a composite of `MemStoreLAB`),
```
public Cell forceCopyOfBigCellInto(Cell cell) {
int size = Segment.getCellLength(cell);
Preconditions.checkArgument(size >= 0, "negative size");
if (size + ChunkCreator.SIZEOF_CHUNK_HEADER <= dataChunkSize) {
// Using copyCellInto for cells which are bigger than the original
maxAlloc
return copyCellInto(cell, dataChunkSize);
} else {
Chunk c = getNewExternalChunk(size);
int allocOffset = c.alloc(size);
return copyToChunkCell(cell, c.getData(), allocOffset, size);
}
}
```
We can see `MemStoreLABImpl.forceCopyOfBigCellInto(Cell)` only calls
`MemStoreLABImpl.getNewExternalChunk(int)` when
`size + ChunkCreator.SIZEOF_CHUNK_HEADER > dataChunkSize`, so when in above
`MemStoreLABImpl.getNewExternalChunk(int)` because
`allocSize > ChunkCreator.getInstance().getChunkSize()`, the incorrect
`MemStoreLABImpl.getNewExternalChunk(ChunkType) ` is in fact not be called,
instead `ChunkCreator.getJumboChunk(size) ` which uses the correct
`IndexType.CHUNK_MAP` is called, so in this code path,
`MemStoreLABImpl.getNewExternalChunk(ChunkType) ` is actually not used, which
is fortunate to not cause serious bug now.
2. `MemStoreLABImpl.getNewExternalChunk(ChunkType) ` is called by
`CellChunkImmutableSegment.allocIndexChunks` to allocate
`IndexChunk`(`CellChunkImmutableSegment` itself is ` IndexType.CHUNK_MAP`),
because the `IndexType.ARRAY_MAP`, it does not put the `IndexChunk` in
`ChunkCreator.chunkIdMap` when the `IndexChunk` could not be pooled , but it
does no matter because it is a `IndexChunk` not a `DataChunk`, and `IndexChunk`
is put in `CellChunkMap.chunks`, which prevents it is GC-ed. If we put the
IndexChunk in `ChunkCreator.chunkIdMap` when using the correct
` IndexType.CHUNK_MAP`, there not cause memory leak because when the
`MemStoreLAB` is closed, `ChunkCreator.removeChunk` is invoked to remove the
chunk from this `ChunkCreator.chunkIdMap`
3. `MemStoreLABImpl.getNewExternalChunk(ChunkType) ` is called by
`ImmutableMemStoreLAB.getNewExternalChunk(ChunkType)`, we could ignore it
because it is just a composite of `MemStoreLAB`
So In conclusion, we can say it is safe to change
`MemStoreLABImpl.getNewExternalChunk(ChunkType) ` from `IndexType.ARRAY_MAP`
to `IndexType.CHUNK_MAP`, and from another perspective, because the IndexType
of `MemStoreLABImpl` could only be` IndexType.CHUNK_MAP` ,and chunk could only
be acquired from `ChunkCreator `through `MemStoreLABImpl`, we could also safely
to make `ChunkCreator` to allocate chunk only for ` IndexType.CHUNK_MAP` when
using `CompactingMemStore`.
For `DefaultMemStore`, it is also reasonable to always put the chunk in
`ChunkCreator.chunkIdMap` because(it is also the code behavior before this PR):
1.When the `MemStoreLAB` which created the chunk is not closed, this chunk
is used by the `Segment` which references this `MemStoreLAB`, so this chunk
certainly should not be GC-ed, putting the chunk in `ChunkCreator.chunkIdMap`
does not prevent useless chunk to be GC-ed.
2.When the `MemStoreLAB` which created the chunk is closed, and if the chunk
is not pooled, `ChunkCreator.removeChunk` is invoked to remove the chunk from
this `ChunkCreator.chunkIdMap`,so there is no memory leak.
So in my opinion, it is safe to remove the `IndexType` from `ChunkCreator`
and always to put the chunk in `ChunkCreator.chunkIdMap` to make the code more
simpler.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]