[ 
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215997#comment-15215997
 ] 

Anastasia Braginsky commented on HBASE-14921:
---------------------------------------------

bq. Please say more on this. CellSet is a NavigableMap (not a 
ConcurrentNavigableMap) so I'm missing where we need the 'Concurrent' (is it in 
this patch?)

Indeed, CellSet itself is not a ConcurrentNavigableMap (as the code below 
shows, it implements NavigableSet<Cell>). However, CellSet has a field, 
“delegatee”, whose type is ConcurrentNavigableMap. We want CellSet to support 
different types of delegatees, but they all need to be 
ConcurrentNavigableMaps. Here is the relevant code: 

{quote}

@InterfaceAudience.Private
public class CellSet implements NavigableSet<Cell>  {
  // Implemented on top of a {@link java.util.concurrent.ConcurrentSkipListMap}
  // Differ from CSLS in one respect, where CSLS does "Adds the specified
  // element to this set if it is not already present.", this implementation
  // "Adds the specified element to this set EVEN if it is already present
  // overwriting what was there previous".
  // Otherwise, has same attributes as ConcurrentSkipListSet
  private final ConcurrentNavigableMap<Cell, Cell> delegatee;

  CellSet(final ConcurrentNavigableMap<Cell, Cell> m) { this.delegatee = m;}
{quote}
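
To illustrate the "overwrite even if present" behaviour described in that comment, here is a minimal sketch. OverwritingSet is a hypothetical, much simplified stand-in for CellSet (it does not implement the full NavigableSet interface), and the case-insensitive String comparator in the usage note is only an assumption chosen to make two distinct objects compare as equal.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical, simplified stand-in for CellSet; not the HBase class.
final class OverwritingSet<E> implements Iterable<E> {
  // Any ConcurrentNavigableMap can be plugged in as the delegatee.
  private final ConcurrentNavigableMap<E, E> delegatee;

  OverwritingSet(ConcurrentNavigableMap<E, E> m) {
    this.delegatee = m;
  }

  // Unlike ConcurrentSkipListSet.add, an equal element is replaced.
  // Returns true only if no equal element was present before.
  boolean add(E e) {
    return delegatee.put(e, e) == null;
  }

  // Read through the values: put() on an existing key keeps the old key
  // object and replaces only the value, so the value is the latest element.
  E first() {
    Map.Entry<E, E> entry = delegatee.firstEntry();
    if (entry == null) {
      throw new NoSuchElementException();
    }
    return entry.getValue();
  }

  int size() {
    return delegatee.size();
  }

  @Override
  public Iterator<E> iterator() {
    return delegatee.values().iterator();
  }
}
```

For example, with new ConcurrentSkipListMap<String, String>(String.CASE_INSENSITIVE_ORDER) as the delegatee, adding "a" and then "A" leaves one element, and first() returns "A". A ConcurrentSkipListSet with the same comparator would keep "a".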

bq. Your new names are better. I considered 'flat' Map but shied away given its 
meaning over in spark/scala; I think it will be ok as long as you stick why its 
a 'flat' map in the javadoc on CellFlatMap.

I’ll change the names and add explanations to the javadoc.

bq. How do you see this working? We do not control the size of inbound Cells. 
They could have some regularity and they could also be erratic to the extreme 
(What to do when a 1G cell arrives into a column family that up to this has 
been taking on metrics?)

Excellent comment! Indeed, we have a problem with Cells bigger than Chunks. We 
have no choice but to introduce special variable-size, very large Chunks to 
support very large Cells. We’ll improve the code after the basic 
benchmarking.
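
A rough sketch of the idea, not the planned implementation: ChunkAllocatorSketch and its chunkSize parameter are hypothetical names, and on-heap ByteBuffers stand in for real Chunks. A Cell that fits is carved out of the current fixed-size Chunk; a Cell larger than a Chunk gets a dedicated Chunk sized to the Cell.

```java
import java.nio.ByteBuffer;

// Hypothetical allocator sketch; names and sizes are illustrative only
// and do not correspond to the MSLAB or ChunkPool API.
final class ChunkAllocatorSketch {
  private final int chunkSize;
  private ByteBuffer current;

  ChunkAllocatorSketch(int chunkSize) {
    this.chunkSize = chunkSize;
    this.current = ByteBuffer.allocate(chunkSize);
  }

  // Returns a buffer with exactly cellLength bytes remaining,
  // into which the Cell data can be copied.
  ByteBuffer allocate(int cellLength) {
    if (cellLength < 0) {
      throw new IllegalArgumentException("negative length: " + cellLength);
    }
    if (cellLength > chunkSize) {
      // Oversized Cell: dedicated variable-size Chunk, one Cell per Chunk.
      return ByteBuffer.allocate(cellLength);
    }
    if (current.remaining() < cellLength) {
      // Current Chunk is too full; start a new fixed-size Chunk.
      current = ByteBuffer.allocate(chunkSize);
    }
    // Carve a window of cellLength bytes out of the current Chunk.
    ByteBuffer slice = current.slice();
    slice.limit(cellLength);
    current.position(current.position() + cellLength);
    return slice;
  }
}
```

One open design question with this approach is whether the oversized Chunks should ever be returned to a pool, since their sizes are irregular.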

bq. I still do not see how the 3 * int is BYTES_IN_CELL. Not important.

I think the problem here (and in some of the other questions) is the name 
“Cell”. CellFlatMap doesn’t work with “Cell data” or with “true Cells” as you 
are (correctly) using the word. CellFlatMap works with a “cell 
representation”: three integers from which all the other Cell information 
(the “true Cell”) can be retrieved. Should I change the name to 
BYTES_FOR_CELL_REPRESENTATION?
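
To make the distinction concrete, here is a small sketch of an index that stores only a three-integer representation per Cell and resolves it to the actual bytes on demand. The meaning of the three integers (chunk id, offset, length) is an assumption made for illustration, and CellIndexSketch, append and cellBytes are hypothetical names, not the CellFlatMap code.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch of a flat index of cell representations.
final class CellIndexSketch {
  // Three ints per entry; assumed here to be (chunkId, offset, length).
  static final int BYTES_FOR_CELL_REPRESENTATION = 3 * Integer.BYTES;

  private final ByteBuffer index;  // holds the representations only
  private final byte[][] chunks;   // holds the actual Cell data
  private int count;

  CellIndexSketch(int maxCells, byte[][] chunks) {
    this.index = ByteBuffer.allocate(maxCells * BYTES_FOR_CELL_REPRESENTATION);
    this.chunks = chunks;
  }

  // Appends one representation; the Cell data itself is not copied.
  void append(int chunkId, int offset, int length) {
    index.putInt(chunkId).putInt(offset).putInt(length);
    count++;
  }

  // Resolves the i-th representation to the bytes of the "true Cell".
  byte[] cellBytes(int i) {
    if (i < 0 || i >= count) {
      throw new IndexOutOfBoundsException("no cell at index " + i);
    }
    int base = i * BYTES_FOR_CELL_REPRESENTATION;
    int chunkId = index.getInt(base);
    int offset = index.getInt(base + Integer.BYTES);
    int length = index.getInt(base + 2 * Integer.BYTES);
    return Arrays.copyOfRange(chunks[chunkId], offset, offset + length);
  }
}
```

Each entry costs a fixed 12 bytes regardless of how large the referenced Cell is, which is why the constant describes the representation and not the Cell.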

{quote}

It was introduced and off by default as is usual when new features. But as also 
happens this is our practice, the facility was 'forgotten'. It came up then 
when our Lars noticed it and wanted to remove it since it was not being used. 
It came up again recently in HBASE-15513

It would seem to make sense enabling it by default if we come up w/ a proper 
sizing. Having it on seems to mess w/ G1GC too. Would need to figure that.

{quote}
I took a look at HBASE-15513; it is very interesting. It seems to favor 
turning ChunkPool on by default, which also looks very reasonable to me. I also 
took a very brief look at HBASE-15180, specifically at the statement:
bq. I noticed about 5-10% improvement on GC times and CPU utilization after 
disabling MSLAB only if using G1GC. Tuning MSLAB helps a little but I don't see 
to much advantage to have it enabled when G1GC is there.
However, I do not see enough evidence in those measurements. How many workloads 
were tested? What were the sizes of the Cells? I need to read this Jira more 
carefully.

bq. We need to do up a memory management doc. Between your work on Segments, 
Segment pipelines, MSLAB chunks, chunk pools and bytebufferpools to host 
requests read from sockets, bucket cache and reference counting bucketcache 
bucket blocks at read time, it would be good if we had a map so we could trace 
a Cell on its travels.

I’ll write the document a little later on.

> Memory optimizations
> --------------------
>
>                 Key: HBASE-14921
>                 URL: https://issues.apache.org/jira/browse/HBASE-14921
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0
>            Reporter: Eshcar Hillel
>            Assignee: Anastasia Braginsky
>         Attachments: CellBlocksSegmentInMemStore.pdf, 
> CellBlocksSegmentinthecontextofMemStore(1).pdf, HBASE-14921-V01.patch, 
> HBASE-14921-V02.patch
>
>
> Memory optimizations including compressed format representation and offheap 
> allocations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
