So the sweep tool is completely optional? A deploy won't degrade if the sweep tool is never run? Then that sounds good.
On Thursday, May 28, 2015, ramkrishna vasudevan <[email protected]> wrote:
> bq. So is a MR runtime required for MOB or not? I read maybe, then no, then
> here maybe again. What happens if one does not have a MR runtime and
> therefore can never run the sweeper tool?
>
> Just to make it clear, MOB now does not have an MR dependency. The V1 version
> had a sweeper tool that was dependent on MR. The tool exists even now and
> it still depends on MR. It's like an add-on.
>
> The compaction of MOB is now embedded as part of the core feature of
> compaction without having to use MR.
>
> Regards
> Ram
>
> On Thu, May 28, 2015 at 10:20 AM, Andrew Purtell <[email protected]>
> wrote:
>
> > I have no concerns about MOB in trunk. Go for it.
> >
> > I do have concerns about a subsequent proposal to put it in 1.2. Those
> > concerns center around stability and performance impacts, and a possible
> > dependency on a MR runtime for what I would consider core function.
> >
> > > Regarding the tools and integrity checks,
> > > MOB has a tool based on MR basically for sweeping and compaction apart
> > > from the compactor that runs in the core (without MR dependency).
> >
> > So is a MR runtime required for MOB or not? I read maybe, then no, then
> > here maybe again. What happens if one does not have a MR runtime and
> > therefore can never run the sweeper tool? An incomplete feature on trunk
> > isn't a problem. Later commits can fill in the gaps and then the sum of
> > MOB commits can go back to branch-1. (Experimental != incomplete, IMHO.)
> >
> > If as you say stability and performance testing have already been done and
> > both look great, then that means *when* this is done again for a branch-1
> > merge candidate, the results will likely also be good. I'd like to help
> > out with this. You won't need to prove it, I will do the legwork for my own
> > concerns.
> >
> > > On May 27, 2015, at 8:59 PM, ramkrishna vasudevan <[email protected]> wrote:
> > >
> > > Chiming late here,
> > >
> > > As Matt suggested earlier, utmost care had been taken to ensure that the
> > > MOB code does not interfere with the normal flow and ensured that things
> > > work normally when MOB is not enabled on a family.
> > >
> > > So the entire flow for MOB can be treated as an experimental feature, if
> > > need be. Take the latest case of the guys from Huawei: since they have some
> > > interest in this feature, they are trying the branch hbase-11339 to see
> > > how MOB works.
> > >
> > > If we move this to trunk, then chances are even more people will look into
> > > it, and by the time it comes to 1.3 or 1.4 we will be stable enough.
> > >
> > > Regarding the tools and integrity checks,
> > > MOB has a tool based on MR basically for sweeping and compaction apart from
> > > the compactor that runs in the core (without MR dependency). We could
> > > always add a feature to the existing tool to do integrity checks like Jon
> > > suggests.
> > >
> > > Also, for an experimental feature we could always come up with such a tool,
> > > but in the case of MOB the interdependency between the MOB and actual HFile
> > > data is greater, so just a standalone tool to check integrity on the HFile
> > > may not be easy without having to do some sort of scan on the HFiles and
> > > MOB files. (Not thought on that fully.)
> > >
> > > I would still think that having this feature as experimental in 1.2 makes
> > > sense. Just my thoughts on this, also after being part of the dev process
> > > for this feature where we tried not to touch the core areas affecting
> > > non-MOB cases.
> > >
> > > Some of the perf results produced by Jingcheng's team and Cloudera folks
> > > substantiate the gain this feature provides.
> > >
> > > Regards
> > > Ram
> > >
> > >> On Thu, May 28, 2015 at 9:04 AM, Andrew Purtell <[email protected]> wrote:
> > >>
> > >> Inline
> > >>
> > >>> On Wednesday, May 27, 2015, Jonathan Hsieh <[email protected]> wrote:
> > >>>
> > >>> On Sat, May 23, 2015 at 9:40 AM, Andrew Purtell <[email protected]> wrote:
> > >>>
> > >>>> Regarding performance testing: Whatever has been done on the MOB branch
> > >>>> will be interesting data points, and, potentially encouraging, but porting
> > >>>> to branch-1 will produce a new code base. Earlier results on other code
> > >>>> will not be applicable. We have to start over. Like I said elsewhere, I'm
> > >>>> happy to help with (re)characterizing the perf impact and improvements
> > >>>> produced by the changes.
> > >>>
> > >>> Thank you for the offer to help -- we'd appreciate it!
> > >>
> > >> You bet.
> > >>
> > >>> Although most of my IT test and perf test results were done against
> > >>> trunk (from Sept '14 and then later Feb '15 -- we've been doing
> > >>> them roughly every two weeks now), Jingcheng's most recent performance
> > >>> testing and fault injection testing results were actually done against a
> > >>> version merged/rebased onto hbase 1.0.0 [1]. Though not on the most recent
> > >>> branch-1, would this be close enough and sufficient, or would you still
> > >>> want to redo them?
> > >>
> > >> Closer, yes.
> > >>
> > >> Redo on the branch-1 merge proposal would be important as a confidence
> > >> builder still I believe.
> > >>
> > >>> If we want to redo them when we have a 1.x backport ready to propose,
> > >>> we'll include the augmented ltt [2] that will make it easy to exercise the
> > >>> mob feature's performance.
> > >>>
> > >>> [1] https://github.com/cloudera/hbase/commits/cdh5-1.0.0_5.4.0?page=2
> > >>> (this is cdh5.4.0's hbase 1.0.0-based hbase)
> > >>> [2] https://issues.apache.org/jira/browse/HBASE-13277
> > >>>
> > >>>> What coverage do we have for verifying the integrity of MOB references?
> > >>>> Will the sweep tool detect, alert on, and optionally repair dangling
> > >>>> references? (I could answer this for myself by looking at MOB branch, but
> > >>>> hopefully someone here has an answer at the ready.) I assume we calculate
> > >>>> and store checksums for MOB data itself so we know if values are corrupt.
> > >>>> Does the sweep tool detect MOB value corruption? Can it be repaired? Do we
> > >>>> have a good ops story for why HBCK is no longer sufficient on its own,
> > >>>> there's a separate tool with a whole new set of options - and a requirement
> > >>>> for a MR runtime! - for checking MOB data? That last one is a rhetorical
> > >>>> question (smile), the ops story is... unsatisfying. It's like we've taken a
> > >>>> self sufficient HBase and bolted in parts of Hive, so now we need MR.
> > >>>
> > >>> Our internal compaction detects and alerts at warn level if there is a
> > >>> missing link [3], and then returns an empty value [4].
> > >>
> > >> Ok, thanks
> > >>
> > >>> Mobs are stored in hfiles so we have the same checksumming all other hfiles
> > >>> have.
> > >>
> > >> Ok, thanks
> > >>
> > >>> In the other response, I answered about hbck and how something like
> > >>> Hfile.main() could be a more appropriate checking tool to address this
> > >>> situation.
> > >>
> > >> Ok. Replied there.
> > >>
> > >>> I'm afraid then much of our complete operational story is "unsatisfying"
> > >>> even without mob because it still requires MR -- e.g.
> > >>> copytable, export, import, walplayer, or verifyreplication mr jobs. While
> > >>> I'll agree that having an external system is undesirable and unacceptable
> > >>> for what are mandatory internal operations like compactions, I think
> > >>> requiring mr for a verifymob mr job would be as acceptable as the
> > >>> verifyreplication job.
> > >>
> > >> I think integrity checks are a different class of tool than all others and
> > >> we shouldn't mandate the presence of a MR runtime to execute those. OTOH,
> > >> it's reasonable to provide a standalone tool (if multithreaded) but
> > >> then also a recommended MR version that can achieve better parallelism.
> > >>
> > >>> [3] https://github.com/apache/hbase/blob/hbase-11339/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HMobStore.java#L400
> > >>> [4] https://github.com/apache/hbase/blob/hbase-11339/hbase-server/src/main/java/org/apache/hadoop/hbase/mob/DefaultMobCompactor.java#L224
> > >>>
> > >>>>> On Fri, May 22, 2015 at 1:45 PM, Jonathan Hsieh <[email protected]> wrote:
> > >>>>>
> > >>>>> In another thread Andrew Purtell brought up some concerns about the mob
> > >>>>> feature:
> > >>>>>
> > >>>>> On Fri, May 22, 2015 at 12:40 PM, Andrew Purtell <[email protected]>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Another point of clarification, sorry, I hit the send button too early it
> > >>>>>> seems: I don't believe MOB is fully integrated yet, for example the
> > >>>>>> feature is an extension to store that lacks support for encryption (this
> > >>>>>> would technically be a feature regression); and HBCK. I have not been
> > >>>>>> following MOB too closely so could be mistaken.
> > >>>>>> These issues do not preclude a merge of MOB into trunk, but do preclude
> > >>>>>> a merge back of MOB from trunk to branch-1. I would veto the latter until
> > >>>>>> such shortcomings in the implementation that could be described as
> > >>>>>> regressions are addressed. I would also like to see a performance analysis
> > >>>>>> of a range of workloads before and after in as much detail as can be
> > >>>>>> mustered, and would be happy to volunteer to help out with that.
> > >>>>>
> > >>>>> Here's info on the points brought up:
> > >>>>>
> > >>>>> Encryption support shortcoming is being addressed here:
> > >>>>> https://issues.apache.org/jira/browse/HBASE-13693 (closed)
> > >>>>> https://issues.apache.org/jira/browse/HBASE-13720 (in review)
> > >>>>>
> > >>>>> Hbck has actually been run against the integration test rigs while the
> > >>>>> feature has been enabled, but currently has no explicit unit test or
> > >>>>> simple to run integration test. It currently doesn't report anything
> > >>>>> special about the mob storage area. We can add unit tests that cover hbck
> > >>>>> when the mob path is exercised.
> > >>>>>
> > >>>>> Another suggestion was a tool to check that mob references had
> > >>>>> corresponding mob data. We currently include a mr-based sweeper job that
> > >>>>> could be used to perform this verification. We can add this tool and
> > >>>>> testing for the tool.
> > >>>>>
> > >>>>> I've done some performance testing and Jingcheng and his colleagues have
> > >>>>> done significant amounts of performance testing. We currently have a blog
> > >>>>> post in progress that will share the results of this performance testing.
> > >>>>>
> > >>>>> Jon.
> > >>>>>
> > >>>>> On Wed, May 20, 2015 at 7:38 PM, Ted Yu <[email protected]> wrote:
> > >>>>>
> > >>>>>> This is a useful feature, Jon.
> > >>>>>>
> > >>>>>> I went over the mega-patch and left some comments on review board.
> > >>>>>>
> > >>>>>> I noticed that hbck was not included in the patch. Neither did I find a
> > >>>>>> sub-task of HBASE-11339 that covers hbck.
> > >>>>>>
> > >>>>>> Do you or Jingcheng plan to add MOB-aware capability for hbck?
> > >>>>>>
> > >>>>>> Cheers
> > >>>>>>
> > >>>>>> On Wed, May 20, 2015 at 9:21 AM, Jonathan Hsieh <[email protected]> wrote:
> > >>>>>>
> > >>>>>>> Hi folks,
> > >>>>>>>
> > >>>>>>> The Medium Object (MOB) Storage feature (HBASE-11339[1]) is a modified I/O
> > >>>>>>> and compaction path that allows individual moderately sized values
> > >>>>>>> (10k-10MB) to be stored so that write amplification is reduced when
> > >>>>>>> compared to the normal I/O path. At a high level, it provides alternate
> > >>>>>>> flush and compaction mechanisms that segregate large cells into a separate
> > >>>>>>> area where they are not subject to the potentially frequent compactions
> > >>>>>>> and splits that can be encountered in the normal I/O path. A more detailed
> > >>>>>>> design doc can be found on the hbase-11339 jira.
> > >>>>>>>
> > >>>>>>> Jingcheng Du has been working on the mob feature for a while and Anoop,
> > >>>>>>> Ram and I have been shepherding him through the design revisions and
> > >>>>>>> implementation of the feature in the hbase-11339 branch.[2]
> > >>>>>>>
> > >>>>>>> The branch we are proposing to merge into master is compatible with HBase's
> > >>>>>>> core functionality including snapshots, replication, shell support, behaves
> > >>>>>>> well with table alters, bulk loads and does not require external MR
> > >>>>>>> processes. It has been documented, and subject to many integration test
> > >>>>>>> runs (ITBLL, ITAcidGuarantees, ITIngest) including fault injection.
> > >>>>>>> Performance testing of the feature shows what can be a 2x-3x throughput
> > >>>>>>> improvement for workloads that contain mobs. These results can be seen on
> > >>>>>>> the hbase 2.0 panel discussion slides from hbasecon (once published).
> > >>>>>>>
> > >>>>>>> Recently there have been some hfile encryption related shortcomings that we
> > >>>>>>> could address in branch or in master.
> > >>>>>>>
> > >>>>>>> Earlier iterations of the feature have been tested in production by users
> > >>>>>>> that Jingcheng has been responsible for. A version has also been deployed
> > >>>>>>> at users I have been responsible for. Some of the folks from Huawei
> > >>>>>>> (ashutosh) have also been submitting the recent encryption bug reports
> > >>>>>>> against the hbase-11339 branch so there is some evidence of usage by them.
> > >>>>>>>
> > >>>>>>> The four of us (Jingcheng, Ram, Anoop and I) are satisfied with the
> > >>>>>>> feature and feel it is a good time to call a merge vote. I've posted a
> > >>>>>>> megapatch version for folks who want to peruse the code.
> > >>>>>>> [3]
> > >>>>>>>
> > >>>>>>> What do you all think?
> > >>>>>>>
> > >>>>>>> Thanks,
> > >>>>>>> Jingcheng, Jon, Ram, and Anoop.
> > >>>>>>>
> > >>>>>>> [1] https://issues.apache.org/jira/browse/HBASE-11339
> > >>>>>>> [2] https://github.com/apache/hbase/tree/hbase-11339
> > >>>>>>> [3] https://reviews.apache.org/r/34475/
> > >>>>>>> --
> > >>>>>>> // Jonathan Hsieh (shay)
> > >>>>>>> // HBase Tech Lead, Software Engineer, Cloudera
> > >>>>>>> // [email protected] // @jmhsieh
> > >>>>
> > >>>> --
> > >>>> Best regards,
> > >>>>
> > >>>> - Andy
> > >>>>
> > >>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > >>>> (via Tom White)

--
Best regards,

- Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
