I have no concerns about MOB in trunk. Go for it. I do have concerns about a subsequent proposal to put it in 1.2. Those concerns center around stability and performance impacts, and a possible dependency on an MR runtime for what I would consider core function.

> Regarding the tools and integrity checks, MOB has a tool based on MR basically for sweeping and compaction apart from the compactor that runs in the core (without MR dependency).

So is an MR runtime required for MOB or not? I read maybe, then no, then here maybe again. What happens if one does not have an MR runtime and therefore can never run the sweeper tool?

An incomplete feature on trunk isn't a problem. Later commits can fill in the gaps and then the sum of MOB commits can go back to branch-1. (Experimental != incomplete, IMHO.)

If, as you say, stability and performance testing have already been done and both look great, then that means *when* this is done again for a branch-1 merge candidate, the results will likely also be good. I'd like to help out with this. You won't need to prove it, I will do the legwork for my own concerns.

> On May 27, 2015, at 8:59 PM, ramkrishna vasudevan <[email protected]> wrote:
>
> Chiming late here,
>
> As Matt suggested earlier, utmost care has been taken to ensure that the MOB code does not interfere with the normal flow and that things work normally when MOB is not enabled on a family.
>
> So the entire flow for MOB can be treated as an experimental feature, if need be. Take the latest case of the folks from Huawei: since they have some interest in this feature, they are trying the hbase-11339 branch to see how MOB works.
>
> If we move this to trunk, then chances are even more people will look into it, and by the time it comes to 1.3 or 1.4 we will be stable enough.
>
> Regarding the tools and integrity checks,
> MOB has a tool based on MR basically for sweeping and compaction apart from the compactor that runs in the core (without MR dependency). We could always add a feature to the existing tool to do integrity checks like Jon suggests.
>
> Also, for an experimental feature we could always come up with such a tool, but in the case of MOB the interdependency between the MOB and actual HFile data is greater, so a standalone tool to check integrity on the HFile may not be easy without doing some sort of scan on the HFiles and MOB files. (Not thought on that fully.)
>
> I would still think that having this feature as experimental in 1.2 makes sense. Just my thoughts on this, also after being part of the dev process for this feature, where we tried not to touch the core areas affecting non-MOB cases.
>
> Some of the perf results from Jingcheng's team and Cloudera folks substantiate the gain this feature provides.
>
> Regards
> Ram
>
>> On Thu, May 28, 2015 at 9:04 AM, Andrew Purtell <[email protected]> wrote:
>>
>> Inline
>>
>>> On Wednesday, May 27, 2015, Jonathan Hsieh <[email protected]> wrote:
>>>
>>> On Sat, May 23, 2015 at 9:40 AM, Andrew Purtell <[email protected]> wrote:
>>>
>>>> Regarding performance testing: Whatever has been done on the MOB branch will be interesting data points, and, potentially encouraging, but porting to branch-1 will produce a new code base. Earlier results on other code will not be applicable. We have to start over. Like I said elsewhere, I'm happy to help with (re)characterizing the perf impact and improvements produced by the changes.
>>>
>>> Thank you for the offer of help -- we'd appreciate it!
>>
>> You bet.
>>
>>> Although most of my IT and perf test results were done against trunk (from Sept '14 and then later Feb '15 -- we've been doing them roughly every two weeks now), Jingcheng's most recent performance testing and fault injection testing results were actually done against a version merged/rebased onto hbase 1.0.0 [1]. Though not on the most recent branch-1, would this be close enough and sufficient, or would you still want to redo them?
>>
>> Closer, yes.
>>
>> Redoing them on the branch-1 merge proposal would still be important as a confidence builder, I believe.
>>
>>> If we want to redo them when we have a 1.x backport ready to propose, we'll include the augmented ltt [2] that will make it easy to exercise the mob feature's performance.
>>>
>>> [1] https://github.com/cloudera/hbase/commits/cdh5-1.0.0_5.4.0?page=2 (this is cdh5.4.0's hbase 1.0.0-based hbase)
>>> [2] https://issues.apache.org/jira/browse/HBASE-13277
>>>
>>>> What coverage do we have for verifying the integrity of MOB references? Will the sweep tool detect, alert on, and optionally repair dangling references? (I could answer this for myself by looking at MOB branch, but hopefully someone here has an answer at the ready.) I assume we calculate and store checksums for MOB data itself so we know if values are corrupt. Does the sweep tool detect MOB value corruption? Can it be repaired? Do we have a good ops story for why HBCK is no longer sufficient on its own, there's a separate tool with a whole new set of options - and a requirement for an MR runtime! - for checking MOB data? That last one is a rhetorical question (smile), the ops story is... unsatisfying. It's like we've taken a self sufficient HBase and bolted in parts of Hive, so now we need MR.
>>>
>>> Our internal compaction detects and alerts at warn level if there is a missing link [3], and then returns an empty value [4]
>>
>> Ok, thanks
>>
>>> Mobs are stored in hfiles so we have the same checksumming all other hfiles have.
>>
>> Ok, thanks
>>
>>> In the other response, I answered about hbck and how something like Hfile.main() could be a more appropriate checking tool to address this situation.
>>
>> Ok. Replied there.
>>
>>> I'm afraid then much of our complete operational story is "unsatisfying" even without mob because it still requires MR -- e.g. copytable, export, import, walplayer, or verifyreplication mr jobs. While I'll agree that having an external system is undesirable and unacceptable for what are mandatory internal operations like compactions, I think requiring mr for a verifymob mr job would be as acceptable as the verifyreplication job.
>>
>> I think integrity checks are a different class of tool than all others and we shouldn't mandate the presence of an MR runtime to execute those. OTOH, it's reasonable to provide a standalone tool (if multithreaded) but then also a recommended MR version that can achieve better parallelism.
>>
>>> [3] https://github.com/apache/hbase/blob/hbase-11339/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HMobStore.java#L400
>>> [4] https://github.com/apache/hbase/blob/hbase-11339/hbase-server/src/main/java/org/apache/hadoop/hbase/mob/DefaultMobCompactor.java#L224
>>>
>>>>> On Fri, May 22, 2015 at 1:45 PM, Jonathan Hsieh <[email protected]> wrote:
>>>>>
>>>>> In another thread andrew purtell brought up some concerns about the mob feature:
>>>>>
>>>>> On Fri, May 22, 2015 at 12:40 PM, Andrew Purtell <[email protected]> wrote:
>>>>>
>>>>>> Another point of clarification, sorry, I hit the send button too early it seems: I don't believe MOB is fully integrated yet, for example the feature is an extension to store that lacks support for encryption (this would technically be a feature regression); and HBCK. I have not been following MOB too closely so could be mistaken. These issues do not preclude a merge of MOB into trunk, but do preclude a merge back of MOB from trunk to branch-1.
>>>>>> I would veto the latter until such shortcomings in the implementation that could be described as regressions are addressed. I would also like to see a performance analysis of a range of workloads before and after in as much detail as can be mustered, and would be happy to volunteer to help out with that.
>>>>>
>>>>> Here's info on the points brought up:
>>>>>
>>>>> The encryption support shortcoming is being addressed here:
>>>>> https://issues.apache.org/jira/browse/HBASE-13693 (closed)
>>>>> https://issues.apache.org/jira/browse/HBASE-13720 (in review)
>>>>>
>>>>> Hbck has actually been run against the integration test rigs while the feature has been enabled, but currently has no explicit unit test or simple-to-run integration test. It currently doesn't report anything special about the mob storage area. We can add unit tests that cover hbck when the mob path is exercised.
>>>>>
>>>>> Another suggestion was a tool to check that mob references had corresponding mob data. We currently include an mr-based sweeper job that could be used to perform this verification. We can add this tool and testing for the tool.
>>>>>
>>>>> I've done some performance testing and Jingcheng and his colleagues have done significant amounts of performance testing. We currently have a blog post in progress that will share the results of this performance testing.
>>>>>
>>>>> Jon.
>>>>>
>>>>> On Wed, May 20, 2015 at 7:38 PM, Ted Yu <[email protected]> wrote:
>>>>>
>>>>>> This is a useful feature, Jon.
>>>>>>
>>>>>> I went over the mega-patch and left some comments on review board.
>>>>>>
>>>>>> I noticed that hbck was not included in the patch. Neither did I find a sub-task of HBASE-11339 that covers hbck.
>>>>>>
>>>>>> Do you or Jingcheng plan to add MOB-aware capability for hbck?
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> On Wed, May 20, 2015 at 9:21 AM, Jonathan Hsieh <[email protected]> wrote:
>>>>>>
>>>>>>> Hi folks,
>>>>>>>
>>>>>>> The Medium Object (MOB) Storage feature (HBASE-11339 [1]) is a modified I/O and compaction path that allows individual moderately sized values (10k-10MB) to be stored so that write amplification is reduced when compared to the normal I/O path. At a high level, it provides alternate flush and compaction mechanisms that segregate large cells into a separate area where they are not subject to the potentially frequent compactions and splits that can be encountered in the normal I/O path. A more detailed design doc can be found on the hbase-11339 jira.
>>>>>>>
>>>>>>> Jingcheng Du has been working on the mob feature for a while and Anoop, Ram and I have been shepherding him through the design revisions and implementation of the feature in the hbase-11339 branch. [2]
>>>>>>>
>>>>>>> The branch we are proposing to merge into master is compatible with HBase's core functionality including snapshots, replication, and shell support, behaves well with table alters and bulk loads, and does not require external MR processes. It has been documented, and subjected to many integration test runs (ITBLL, ITAcidGuarantees, ITIngest) including fault injection. Performance testing of the feature shows what can be a 2x-3x throughput improvement for workloads that contain mobs. These results can be seen on the hbase 2.0 panel discussion slides from hbasecon (once published).
>>>>>>>
>>>>>>> Recently there have been some hfile encryption related shortcomings that we could address in branch or in master.
>>>>>>>
>>>>>>> Earlier iterations of the feature have been tested in production by users that Jingcheng has been responsible for. A version has also been deployed at users I have been responsible for. Some of the folks from Huawei (ashutosh) have also been submitting the recent encryption bug reports against the hbase-11339 branch so there is some evidence of usage by them.
>>>>>>>
>>>>>>> The four of us (Jingcheng, Ram, Anoop and I) are satisfied with the feature and feel it is a good time to call a merge vote. I've posted a megapatch version for folks who want to peruse the code. [3]
>>>>>>>
>>>>>>> What do you all think?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jingcheng, Jon, Ram, and Anoop.
>>>>>>>
>>>>>>> [1] https://issues.apache.org/jira/browse/HBASE-11339
>>>>>>> [2] https://github.com/apache/hbase/tree/hbase-11339
>>>>>>> [3] https://reviews.apache.org/r/34475/
>>>>>>> --
>>>>>>> // Jonathan Hsieh (shay)
>>>>>>> // HBase Tech Lead, Software Engineer, Cloudera
>>>>>>> // [email protected] // @jmhsieh
>>>>
>>>> --
>>>> Best regards,
>>>>
>>>>    - Andy
>>>>
>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
