Re: Review Request 51785: [HBASE-16608] Merge for the segments in the compaction pipeline and simplifying the user interface for in-memory flush

Anoop Sam John Tue, 18 Oct 2016 04:49:31 -0700


> On Oct. 16, 2016, 10:43 p.m., Anoop Sam John wrote:
> > hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HeapMemStoreLAB.java,
> >  line 179
> > <https://reviews.apache.org/r/51785/diff/7-8/?file=1511963#file1511963line179>
> >
> >     Able to get what u r trying to do here.  Seems very complex now.
> >     So can we take a diff path?
> >     We have to merge N segments into one here means we have to merge N 
> > MSLABs referred by these segments.  What we do as per this patch is make a 
> > new HeapMSLAB object and add all pooled chunk from other MSLABs int the Q 
> > in new one.  And other stuff of refer next etc.
> >     
> >     Can we make a new MSLAB impl? This will act as a collection of N 
> > HeapMSLABs. Now we have to impl 4 methods
> >     incScannerCount and decScannerCount based on an openScannerCount in the 
> > new class.
> >     close() method like in HeapMSLAB and has to call close() on each of the 
> > N MSLABs it wrap
> >     This is an ImmutableSegment so no one is going to call copyCellInto(). 
> > Make this impl in this new class throw IllegalStateException or so.  Will 
> > that look good?
> 
> Michael Stack wrote:
>     No copying? That sounds good.
> 
> Anoop Sam John wrote:
>     So this fix and so this comment is applicable to merge op alone..  There 
> is still an open point which Me/Ram are not agreeing around this.. That is 
> whether to do this in mem merge.  This is required to limit the #segments it 
> pipeline and to get bigger sized segment at tail of the pipeline. As we do 
> flush of the tail of the segment.  But our point is let us flush all segments 
> (I mean flush to disk)..  In case of in mem compaction (This action is by def 
> OFF and as per data usage one can enable) it is worth to flush only tail.  In 
> this normal use case let us flush all segments and so there is only flatten 
> of segments comes in and so no bug as this. There is absolutely no gain we 
> get because of delaying of flush of the other segments in the pipeline ( I 
> mean in merge kind of usages)..   So I suggest let us discuss on that before 
> going to work on this. Else it wl b waste of time.
> 
> Anastasia Braginsky wrote:
>     I am referencing in this answer the Anoop's suggestion to create a new 
> MSLAB implementation that act as a collection of N old HeapMSLABs. Please pay 
> attention that old scans (opened on N old HeapMSLABs) are not aware of 
> existing of the new MSLAB. In addition the close() of the new MSLAB doesn't 
> automatically means old MSLABs need to be closed, as there might be still 
> ongoing scans. Thus I do not think this idea works (at least not as it is 
> presented now).
> 
> Anastasia Braginsky wrote:
>     Anoop, I do not understand about what exactly you disagree. Are you 
> saying that merging of the segments is waste of time? Do you prefer to do a 
> merge upon the flush to disk? Aa we have seen on stress benchmarks this 
> behavior delays flushes and blocks writes which is undesirable. Please 
> explain yourself clearer or we can talk about if we have a meeting this 
> Wednesday.
> 
> Anoop Sam John wrote:
>     Reply
>     First abt the COllection of N HeapMSLAB wont work.
>     You say problem with old scans which were working on old HeapMSLABs and 
> they dont know abt the new MSLAB.  There wont be an issue.  We will really 
> close an MSLAB (I mean return chunks to pool) when there is a close call 
> happened on it AND the scanner count is zero.  So in the case u mentioned, as 
> there are old scan, the scanner count on these MSLABs are >0. Remember we are 
> not going to call close() on these.  We just have a new Collection based 
> MSLAB impl which point to N old MSLABs.  So the old scanners still use the 
> old MSLABs and once done they decr the counter.
>     Now on the new one there might be new scan reqs.  So we have scanner 
> count in new COllection based MSLAB impl.  When we are at a point to close 
> this new MSLAB, we will call close on it. It will call close() on all old 
> MSLABs. (If its scanner count is zero or else the close on old MSLABs will be 
> delayed till a decr op makes the count to 0)  Now again within the old MSLAB 
> there will be normal close flow.. Iff its scanner count is 0, it will really 
> get closed.  So in a rare case where the new MSLAB is made and it is at a 
> point of getting closed and still a very old scanner still uses old MSLAB, 
> still no issue.. The one old MSLAB will get really closed only when the long 
> pending scan ends.
>     
>     Second abt disagreement of merge
>     Ya. I feel that there should NOT be an in memory merge op at all.  We 
> need to have the compaction op any way.  Only in case of compaction op mode, 
> the on disk flush will flush only tail of the pipeline.  In other case (def 
> case) there will be in memory flushes (flatten op) and we keep adding 
> segments to pipeline and here when a flush to disk req comes, we will flush 
> all segments not just tail of pipeline..  We try to do this merge only to 
> make the #segments in pipeline less and to have a bigger sized tail for 
> pipeline. No need to do merge at any point of time.. The flush will any way 
> use a scanner (With SQM and all) and let us make that scanner to be on all 
> segmnets.
>     
>     Am I making it clear..  Because of 2nd I suggested let us not fix 
> comments regarding merge.
> 
> ramkrishna vasudevan wrote:
>     I agree with the N MSLAB impl that Anoop says here. It should work. As i 
> said in the comment in the JIRA the place that you have added the new APIs I 
> was not sure if they are really correct but NMSLAB impl should work out 
> because ongoing scans on old MSLAB should stil use them and the inc/decr 
> operation should work on them seemlessly and that once they are added to the 
> new N MSLAB and any new scan on them would have any way incremented the count 
> of the oldMSLAB either individually or as a whole so some  how they don't get 
> closed.
> 
> Edward Bortnikov wrote:
>     Just a general comment - 
>     
>     It's important to keep the mechanism compatible with the original intent 
> of doing index or data compaction while the data is in memory. Therefore, we 
> can't just delay the segments merge until the flush to disk. The segments 
> pipeline should have freedom to compact upon the its own timing (matter of 
> policy). 
>     
>     Any design simplification that preserves this degree of freedom is 
> welcome.


So there should be a policy which allows not to do any merge at all.  And in 
that case flush only tail of pipeline is going to a probelm (I think u all 
agreed with that already).. That is why said in such cases, all segments should 
be flushed to disk.  There is no adv at all we are getting in delaying flush of 
other regions in such a case.


- Anoop Sam


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51785/#review152809
-----------------------------------------------------------


On Oct. 14, 2016, 12:49 p.m., Anastasia Braginsky wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51785/
> -----------------------------------------------------------
> 
> (Updated Oct. 14, 2016, 12:49 p.m.)
> 
> 
> Review request for hbase.
> 
> 
> Repository: hbase-git
> 
> 
> Description
> -------
> 
> This is a step toward final compacting memstore that allowes two modes of 
> work: index-compaction and data-compaction. 
> 
> The index-compaction means that when the new segment is pushed into the 
> pipeline, it is flattened and probably merged with old segments in the 
> pipeline. The new merge "feature" induces no data-copy-compaction and no 
> speculative SQM scan. 
> The compacting memstore of the data-compaction type means the usage of the 
> data-copy-compaction.
> 
> 
> Diffs
> -----
> 
>   
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactingMemStore.java
>  177f222 
>   
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionPipeline.java
>  6a13f43 
>   
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HeapMemStoreLAB.java
>  378601d 
>   
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ImmutableSegment.java
>  12b7916 
>   
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreCompactor.java
>  714ffe3 
>   
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreCompactorIterator.java
>  9798ec2 
>   
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreCompactorSegmentsIterator.java
>  PRE-CREATION 
>   
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreMergerSegmentsIterator.java
>  PRE-CREATION 
>   
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSegmentsIterator.java
>  PRE-CREATION 
>   
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SegmentFactory.java
>  510ebbd 
>   
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/VersionedSegmentsList.java
>  2e8bead 
>   
> hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactingMemStore.java
>  211a6d8 
>   
> hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactingToCellArrayMapMemStore.java
>  f89a040 
>   
> hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStoreLAB.java
>  1ea5112 
>   
> hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestPerColumnFamilyFlush.java
>  6bfaa59 
>   
> hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWalAndCompactingMemStoreFlush.java
>  74826b0 
> 
> Diff: https://reviews.apache.org/r/51785/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Anastasia Braginsky
> 
>

Re: Review Request 51785: [HBASE-16608] Merge for the segments in the compaction pipeline and simplifying the user interface for in-memory flush

Reply via email to