On Fri, Jun 1, 2018 at 11:50 AM, Mike Drob <[email protected]> wrote:
> On Fri, Jun 1, 2018 at 12:01 PM, Stack <[email protected]> wrote:
> >
> > On Fri, Jun 1, 2018 at 9:36 AM, Mike Drob <[email protected]> wrote:
> > >
> > > I'm working on untangling this mess, but I just got lost in the weeds
> > > of the argument on HBASE-6868.
> > >
> > > I have to assume that this concern over double checksumming, or
> > > missing checksums on remote files, or whatever else is going on in
> > > that issue only applies to truly ancient versions of Hadoop at this
> > > point?
> >
> > I don't think so. Skimming that issue, hbase versions are discussed,
> > not Hadoop versions. What you seem to be trying to sort out is hbase
> > configs/doc around what we ask of HDFS (and SCR) regarding checksumming
> > and when. HBASE-6868 was about our checksumming differing depending on
> > whether WAL or HFile; we were inconsistent. It is always possible to
> > double-checksum. The default shouldn't be doing this though (at least
> > such was the case last time I looked).
>
> Do we think it's safe to say that if SCR are enabled, we always want to
> enable HBase checksums and skip HDFS checksums? That's what the docs
> appear to recommend, but the code approaches it from the converse
> perspective: if HBase checksumming is enabled, we set
> dfs.client.read.shortcircuit.skip.checksum to true and
> fs.setVerifyChecksum(false) in HFileSystem. The user doesn't even have
> the option to override that. HBase checksumming is on by default, so we
> don't need to mention any of this in the docs, or we can mention turning
> on hbase xsum and turning off dfs xsum and then clarify that none of
> this is actionable.

Probably best to set up a rig and verify. You'll then have confidence
making the doc and code change. I have not looked at this stuff in years
other than a recent attempt at underlining the importance of enabling SCR;
I tried to codify my understanding from back then in the doc (but only
seem to have confused).

Thanks Michael,
S
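[For reference, the short-circuit-read setup under discussion is driven by a
handful of properties. A minimal hbase-site.xml sketch; the property names
are stock HBase/HDFS, but the socket path is an example value and
hbase.regionserver.checksum.verify is shown only for explicitness since it
defaults to true:]

```xml
<!-- Sketch of an SCR-enabled hbase-site.xml (illustrative, not a recommendation). -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <!-- Must match the domain socket path configured on the DataNode. -->
  <name>dfs.domain.socket.path</name>
  <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>
<property>
  <!-- HBase-level checksums; on by default, shown here for explicitness.
       Per the discussion above, when this is on, HFileSystem disables the
       HDFS-level checksum verification itself. -->
  <name>hbase.regionserver.checksum.verify</name>
  <value>true</value>
</property>
```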
> > > 2)
> > >
> > > Docs claim: dfs.client.read.shortcircuit.buffer.size = 131072 —
> > > Important to avoid OOME — hbase has a default it uses if unset, see
> > > hbase.dfs.client.read.shortcircuit.buffer.size; its default is
> > > 131072.
> > >
> > > This is very confusing; we should set the property to some value,
> > > because if it's unset then we will use... the same value? This reads
> > > like needless operator burden.
> > >
> > > Looking at the code, the default we really use is 64 * 1024 * 2 =
> > > 131072, which does line up with the documented value.
> > >
> > > The default HDFS value is 1024 * 1024, which suggests that they're
> > > expecting a value in the MB range and we're giving one in the KB
> > > range? See:
> > > https://github.com/apache/hadoop/blob/master/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java#L146-L147
> > >
> > > Just now, I'm realizing that the initial comment in the docs might
> > > mean to tune it way down to avoid OOME; my initial reading was that
> > > we need to increase the ceiling from whatever default setting comes
> > > in via HDFS. Would be good to clarify this, and also figure out what
> > > units the value is in.
> >
> > Agree.
> >
> > IIRC, the intent was to set it way down from the usual default because
> > hbase runs with many more open files than your typical HDFS client
> > does.
>
> Ok, we can update the docs to clarify that this is a value in bytes, the
> default HDFS value is 1MB, our default value is 128KB, and that the
> total memory used will be the buffer size * number of file handles.
> What's a reasonable first-order approximation for the number of files
> per RS that will be affected by SCR? Hosted Regions * Columns? Doesn't
> need a code change, I think, but the rec for 131072 should be removed.
>
> > > 3)
> > >
> > > Docs suggest: Ensure data locality.
> > > In hbase-site.xml, set
> > > hbase.hstore.min.locality.to.skip.major.compact = 0.7 (meaning that
> > > 0.7 <= n <= 1)
> > >
> > > I can't find anything else about this property in the docs. Digging
> > > through the code, I find an oblique reference to HBASE-11195, but
> > > there's no RN there or docs from there, and reading the issue doesn't
> > > help me understand how this operates either. It looks like there was
> > > follow-on work done, but it would be useful to know how we arrived at
> > > 0.7 (seems arbitrary) and how an operator could figure out if that
> > > setting is good for them or needs to slide higher/lower.
> >
> > I don't know anything of the above.
>
> Will save this for later then.
>
> > Thanks,
> > S
>
> > > Thanks,
> > > Mike
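[Mike's sizing rule above — total memory = buffer size * number of open file
handles — can be put in concrete terms. A back-of-envelope sketch; the
10,000-file count is a hypothetical workload, not a measurement, and the
class/method names are made up for illustration:]

```java
// Back-of-envelope memory cost of short-circuit read buffers: roughly one
// buffer per open file handle. The 10,000-file figure is hypothetical.
public class ScrBufferEstimate {

    // Total bytes pinned by SCR buffers for a given buffer size and open-file count.
    static long scrBufferBytes(long bufferSizeBytes, long openFiles) {
        return bufferSizeBytes * openFiles;
    }

    public static void main(String[] args) {
        long hbaseDefault = 64 * 1024 * 2; // 131072 bytes (128 KB), HBase's default
        long hdfsDefault = 1024 * 1024;    // 1 MB, stock HDFS client default
        long openFiles = 10_000;           // hypothetical store-file count per RS

        // 128 KB buffers cost ~1.2 GiB; 1 MB buffers ~9.8 GiB for the same
        // file count, which is why HBase tunes the value way down.
        System.out.println(scrBufferBytes(hbaseDefault, openFiles)); // 1310720000
        System.out.println(scrBufferBytes(hdfsDefault, openFiles));  // 10485760000
    }
}
```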

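[For item 3, the setting in question is just another hbase-site.xml property.
A sketch of how it would appear; 0.7 is the value the docs suggest, and the
comment paraphrases HBASE-11195's intent, which the thread above notes is
worth verifying:]

```xml
<!-- Per HBASE-11195 (paraphrased, unverified): when a region's block
     locality is already at or above this ratio (range 0.0-1.0), the
     locality-restoring major compaction can be skipped; below it, the
     files are major-compacted anyway to rewrite blocks locally. -->
<property>
  <name>hbase.hstore.min.locality.to.skip.major.compact</name>
  <value>0.7</value>
</property>
```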