[
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15212838#comment-15212838
]
stack commented on HBASE-14921:
-------------------------------
Reviewed again after another read of the attached diagram.
Does the 'CellBlock' Map have to be a Concurrent Map? This is a readonly
datastructure once made? No need of the 'Concurrent'?
If so, you made CellSet (or Eshcar) made CellSet which implements NaviableSet.
Here we are slinging Cells and implement NavigableMap. Call it CellMap rather
than CellBock? I was trying to come up with a name that made reference to how
it is actually implemented -- e.g. like ConcurrentSkipListMap -- which would
seem to say all it CellArrayMap rather than CellBlock but when I see
CellBlockObjectArray later.. may be CellBinSearchMap....
It is odd that comparator is passed on construction and when we call
createCellBlocks. Should constructor be protected so folks can't construct one
w/o there being data in it? Should we save the comparator when createCellBlocks
rather than pass on construction? Then we have to make sure the comparators
agree? Or do we? Seems like you need it on construction so can propagate
instances with the createCellBlocks call. Seems like allowing passing different
comparators would make for trouble so don't allow passing comparator on
createCellBlocks?
Why name the method getCellFromIndex(int i)? Should it be get(i) because we are
getting the ith Cell in Map? (The comment says that this is what it is doing)
It'd be cool to compare a CSLM to one of these CAMs at different scales to see
difference in perf in a test bench.
For CellBlockObjectArray, it should be called CellArrayMap?
What does "CellBlockObjectArray is a simple array of Cells allocated using
JVM." mean? All our stuff is on a JVM (smile)
Below should be final
Cell[] block;
Is the below total Cels?
private int numOfCellsInsideChunk;
It looks like it is Cells per chunk? This is constant for all Chunks in a
CellBlockSerialized?
(BTW CellBlockSerialized could be ChunkCellMap or CellMLABMap
This seems to be the 'key', or 'index key' size, not bytes in a cell?: private
static final int BYTES_IN_CELL = 3*(Integer.SIZE / Byte.SIZE); // each Cell
requires 3 integers
I love the way this works. Very nice. Simple. Only bit is how you ensure all
Chunks have same amount of Cells?
Remove this stuff rather than just comment it out:
//c = (chunkPool != null) ? chunkPool.getChunk() : new Chunk(chunkSize,
5); //14921
197
The 14921 everywhere should be stripped. Its HBASE-14921?
Is this a hardcoding?
201 c = new Chunk(chunkSize, 5);
The Chunk Pool needs to be on for this all to work? Is that in this patch?
For...
* Only used in testing
we usually mark with @VisibleForTestiing annotation.
Patch looks great.
> Memory optimizations
> --------------------
>
> Key: HBASE-14921
> URL: https://issues.apache.org/jira/browse/HBASE-14921
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.0.0
> Reporter: Eshcar Hillel
> Assignee: Anastasia Braginsky
> Attachments: CellBlocksSegmentInMemStore.pdf,
> CellBlocksSegmentinthecontextofMemStore(1).pdf, HBASE-14921-V01.patch,
> HBASE-14921-V02.patch
>
>
> Memory optimizations including compressed format representation and offheap
> allocations
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)