> On Oct. 16, 2016, 10:43 p.m., Anoop Sam John wrote:
> > hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HeapMemStoreLAB.java,
> > line 179
> > <https://reviews.apache.org/r/51785/diff/7-8/?file=1511963#file1511963line179>
> >
> > I am able to see what you are trying to do here, but it seems very
> > complex now. So can we take a different path?
> > We have to merge N segments into one here, which means we have to merge
> > the N MSLABs referred to by these segments. What we do as per this patch
> > is make a new HeapMSLAB object and add all the pooled chunks from the
> > other MSLABs into the queue in the new one, plus the other bookkeeping
> > (the next references etc.).
> >
> > Can we instead make a new MSLAB impl? It would act as a collection of N
> > HeapMSLABs. We then have to implement four methods:
> > incScannerCount() and decScannerCount(), based on an openScannerCount in
> > the new class.
> > close(), as in HeapMSLAB, which has to call close() on each of the N
> > MSLABs it wraps.
> > This is an ImmutableSegment, so no one is going to call copyCellInto();
> > make that method in the new class throw an IllegalStateException or
> > similar. Would that look good?
>
> Michael Stack wrote:
> No copying? That sounds good.
>
> Anoop Sam John wrote:
> So this fix, and this comment, apply to the merge op alone. There is
> still an open point on which Ram and I do not agree: whether to do this
> in-memory merge at all. The merge is meant to limit the number of
> segments in the pipeline and to get a bigger-sized segment at the tail
> of the pipeline, since we flush the tail of the pipeline. But our point
> is: let us flush all segments (I mean flush to disk). In the case of
> in-memory compaction (which is OFF by default and can be enabled
> depending on data usage) it is worth flushing only the tail. In the
> normal use case let us flush all segments; then only flattening of
> segments comes in, and there is no bug like this one. There is
> absolutely no gain from delaying the flush of the other segments in the
> pipeline (in merge-type usage). So I suggest we discuss that before
> working on this; otherwise it will be a waste of time.
>
> Anastasia Braginsky wrote:
> In this answer I am referring to Anoop's suggestion to create a new
> MSLAB implementation that acts as a collection of N old HeapMSLABs.
> Please note that old scans (opened on the N old HeapMSLABs) are not
> aware of the existence of the new MSLAB. In addition, a close() of the
> new MSLAB does not automatically mean the old MSLABs should be closed,
> as there might still be ongoing scans. Thus I do not think this idea
> works (at least not as presented now).
>
> Anastasia Braginsky wrote:
> Anoop, I do not understand what exactly you disagree with. Are you
> saying that merging the segments is a waste of time? Do you prefer to do
> the merge upon the flush to disk? As we have seen in stress benchmarks,
> that behavior delays flushes and blocks writes, which is undesirable.
> Please explain more clearly, or we can talk about it if we have a
> meeting this Wednesday.
>
> Anoop Sam John wrote:
> First, about the collection of N HeapMSLABs not working:
> You say the problem is old scans that were working on the old HeapMSLABs
> and do not know about the new MSLAB. There will not be an issue. We
> really close an MSLAB (I mean return its chunks to the pool) only when
> close() has been called on it AND its scanner count is zero. So in the
> case you mention, as there are old scans, the scanner count on those
> MSLABs is > 0. Remember we are not going to call close() on them
> directly. We just have a new collection-based MSLAB impl which points to
> the N old MSLABs. So the old scanners still use the old MSLABs and
> decrement the counter once done.
> Now there might be new scan requests on the new one, so we keep a
> scanner count in the new collection-based MSLAB impl as well. When we
> are at the point of closing this new MSLAB, we call close() on it. If
> its scanner count is zero, it calls close() on all the old MSLABs;
> otherwise the close of the old MSLABs is delayed until a decrement
> brings the count to 0. Within each old MSLAB there is then the normal
> close flow: only if its scanner count is 0 does it really get closed. So
> even in the rare case where the new MSLAB is at the point of being
> closed while a very old scanner still uses an old MSLAB, there is no
> issue: that one old MSLAB really gets closed only when the long-pending
> scan ends.
>
> Second, about the disagreement on merge:
> Yes, I feel there should NOT be an in-memory merge op at all. We need
> the compaction op anyway. Only in compaction mode should the on-disk
> flush take only the tail of the pipeline. In the other (default) case
> there are in-memory flushes (the flatten op) and we keep adding segments
> to the pipeline; here, when a flush-to-disk request comes, we should
> flush all segments, not just the tail of the pipeline. We try to do this
> merge only to reduce the number of segments in the pipeline and to have
> a bigger-sized tail for the pipeline. There is no need to do a merge at
> any point in time: the flush will use a scanner anyway (with SQM and
> all), and we can make that scanner span all segments.
>
> Am I making this clear? Because of the second point, I suggested we not
> fix the review comments regarding merge.
>
> ramkrishna vasudevan wrote:
> I agree with the N-MSLAB impl that Anoop describes here; it should work.
> As I said in the comment on the JIRA, I was not sure whether the places
> where you added the new APIs are really correct, but the N-MSLAB impl
> should work, because ongoing scans on the old MSLABs still use them, the
> inc/dec operations work on them seamlessly, and once they are added to
> the new N MSLAB, any new scan on them will have incremented the count of
> the old MSLABs anyway, either individually or as a whole, so they do not
> get closed.
>
> Edward Bortnikov wrote:
> Just a general comment -
>
> It's important to keep the mechanism compatible with the original intent
> of doing index or data compaction while the data is in memory.
> Therefore, we can't just delay the segment merge until the flush to
> disk. The segment pipeline should have the freedom to compact on its own
> timing (a matter of policy).
>
> Any design simplification that preserves this degree of freedom is
> welcome.
>
> Anoop Sam John wrote:
> So there should be a policy which allows not doing any merge at all. In
> that case, flushing only the tail of the pipeline is going to be a
> problem (I think you all agreed with that already). That is why I said
> that in such cases all segments should be flushed to disk. There is no
> advantage at all in delaying the flush of the other segments in such a
> case.
And this is the collection-based MSLAB impl I was thinking of. I just tried
making a class, and I am placing it here FYI.
public class CollectionBackedMemstoreLAB implements MemStoreLAB {

  private final AtomicInteger openScannerCount = new AtomicInteger();
  private volatile boolean closed = false;
  private final List<MemStoreLAB> mslabs;

  public CollectionBackedMemstoreLAB(List<MemStoreLAB> mslabs) {
    this.mslabs = mslabs;
  }

  @Override
  public Cell copyCellInto(Cell cell) {
    // Backs an ImmutableSegment only; nothing should ever write into it.
    throw new IllegalStateException();
  }

  @Override
  public void close() {
    this.closed = true;
    // Propagate the close to the wrapped MSLABs only when no scanner is
    // open on this wrapper; otherwise the last decScannerCount() does it.
    if (openScannerCount.get() == 0) {
      for (MemStoreLAB mslab : this.mslabs) {
        mslab.close();
      }
    }
  }

  @Override
  public void incScannerCount() {
    this.openScannerCount.incrementAndGet();
  }

  @Override
  public void decScannerCount() {
    int count = this.openScannerCount.decrementAndGet();
    if (this.closed && count == 0) {
      for (MemStoreLAB mslab : this.mslabs) {
        mslab.close();
      }
    }
  }
}
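To make the intended close semantics concrete, here is a minimal,
self-contained sketch of the same idea. Note the names are hypothetical:
the MemStoreLAB interface is reduced to the three lifecycle methods,
StubHeapLab stands in for HeapMemStoreLAB, and the chunksReturned flag
stands in for returning pooled chunks. It shows that closing the
collection-backed MSLAB while an old scanner is still open does not
reclaim the old MSLAB until that scanner finishes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Reduced MemStoreLAB interface: only the lifecycle methods matter here.
interface MemStoreLAB {
  void incScannerCount();
  void decScannerCount();
  void close();
}

// Stand-in for HeapMemStoreLAB: really "closes" (returns its chunks) only
// once close() has been called AND the open scanner count is zero.
class StubHeapLab implements MemStoreLAB {
  private final AtomicInteger scanners = new AtomicInteger();
  private volatile boolean closed = false;
  volatile boolean chunksReturned = false; // stands in for chunk reclamation

  @Override public void incScannerCount() { scanners.incrementAndGet(); }
  @Override public void decScannerCount() {
    if (scanners.decrementAndGet() == 0 && closed) chunksReturned = true;
  }
  @Override public void close() {
    closed = true;
    if (scanners.get() == 0) chunksReturned = true;
  }
}

// The collection-backed wrapper from the review, minus copyCellInto().
class CollectionBackedLab implements MemStoreLAB {
  private final AtomicInteger scanners = new AtomicInteger();
  private volatile boolean closed = false;
  private final List<MemStoreLAB> mslabs;
  CollectionBackedLab(List<MemStoreLAB> mslabs) { this.mslabs = mslabs; }

  @Override public void incScannerCount() { scanners.incrementAndGet(); }
  @Override public void decScannerCount() {
    if (scanners.decrementAndGet() == 0 && closed) closeAll();
  }
  @Override public void close() {
    closed = true;
    if (scanners.get() == 0) closeAll();
  }
  private void closeAll() { for (MemStoreLAB lab : mslabs) lab.close(); }
}
```

With this, close() on the wrapper propagates close() to the wrapped lab, but
a lab with a pre-merge scanner still open only returns its chunks when that
scanner's decScannerCount() brings the lab's own count to zero.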
- Anoop Sam
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51785/#review152809
-----------------------------------------------------------
On Oct. 14, 2016, 12:49 p.m., Anastasia Braginsky wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51785/
> -----------------------------------------------------------
>
> (Updated Oct. 14, 2016, 12:49 p.m.)
>
>
> Review request for hbase.
>
>
> Repository: hbase-git
>
>
> Description
> -------
>
> This is a step toward the final compacting memstore that allows two modes
> of work: index-compaction and data-compaction.
> 
> Index-compaction means that when a new segment is pushed into the
> pipeline, it is flattened and possibly merged with old segments in the
> pipeline. The new merge "feature" involves no data-copy compaction and no
> speculative SQM scan.
> A compacting memstore of the data-compaction type means data-copy
> compaction is used.
>
>
> Diffs
> -----
>
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactingMemStore.java
> 177f222
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionPipeline.java
> 6a13f43
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HeapMemStoreLAB.java
> 378601d
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ImmutableSegment.java
> 12b7916
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreCompactor.java
> 714ffe3
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreCompactorIterator.java
> 9798ec2
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreCompactorSegmentsIterator.java
> PRE-CREATION
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreMergerSegmentsIterator.java
> PRE-CREATION
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSegmentsIterator.java
> PRE-CREATION
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SegmentFactory.java
> 510ebbd
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/VersionedSegmentsList.java
> 2e8bead
>
> hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactingMemStore.java
> 211a6d8
>
> hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactingToCellArrayMapMemStore.java
> f89a040
>
> hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStoreLAB.java
> 1ea5112
>
> hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestPerColumnFamilyFlush.java
> 6bfaa59
>
> hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWalAndCompactingMemStoreFlush.java
> 74826b0
>
> Diff: https://reviews.apache.org/r/51785/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Anastasia Braginsky
>
>