Hi folks, I was going through our docs on short-circuit read (SCR) setup and ran into some confusion. Asking here before filing JIRA issues. After writing this, I realize the length got a bit out of hand; I don't want to split it into several threads because I think the information is all related, but I may have to if a single thread becomes hard to follow.
The main docs link: http://hbase.apache.org/book.html#shortcircuit.reads

1) Docs claim: set dfs.client.read.shortcircuit.skip.checksum = true "so we don't double checksum (HBase does its own checksumming to save on i/os. See hbase.regionserver.checksum.verify for more on this.)"

Code claims: https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/util/CommonFSUtils.java#L784-L788 — if this property is set, we log a warning? (Unrelated: this logic is duplicated between CommonFSUtils and FSUtils; we'll need a JIRA to clean that up later.) There's also a comment in https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java#L689-L690 claiming we automatically disable double checksumming, which we do in the HFileSystem constructor by setting that same dfs property to true in our conf. So I'm confused whether we should set the property as the docs claim, leave it unset as FSUtils warns, or ignore it and let the RegionServer auto-set it. (Also unrelated: HFileSystem has a check from HBASE-5885 for what I think is HADOOP-9307, but we should be able to simplify some of that logic now.)

2) Docs claim: set dfs.client.read.shortcircuit.buffer.size = 131072, "important to avoid OOME — hbase has a default it uses if unset, see hbase.dfs.client.read.shortcircuit.buffer.size; its default is 131072."

This is very confusing: we tell operators to set the property to a value, because if it's unset we will use... the same value? That reads like needless operator burden. Looking at the code, the default we actually use is 64 * 1024 * 2 = 131072, which matches the documented value exactly, making the recommendation redundant. The stock HDFS default is 1024 * 1024, which suggests HDFS expects a value in the MB range while we use one in the KB range?
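As a sanity check on the numbers above, here is a standalone arithmetic sketch (the class and variable names are mine, not HBase's, and the fallback expression is just my reading of the code):

```java
public class ScrBufferSizeArithmetic {
    public static void main(String[] args) {
        // Value the HBase book tells operators to set explicitly.
        int documented = 131072;
        // Fallback I read out of the HBase code: two 64 KiB blocks.
        int hbaseFallback = 64 * 1024 * 2;
        // Stock HDFS client default for dfs.client.read.shortcircuit.buffer.size.
        int hdfsDefault = 1024 * 1024;

        System.out.println(hbaseFallback);               // 131072
        System.out.println(documented == hbaseFallback); // true
        System.out.println(hdfsDefault / hbaseFallback); // 8
    }
}
```

If the fallback really is 64 * 1024 * 2, it equals the documented 131072 exactly (and the stock HDFS default is 8x larger), which only sharpens the "why ask operators to set it at all" question.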
See: https://github.com/apache/hadoop/blob/master/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java#L146-L147

Just now I'm realizing that the comment in the docs might mean to tune the value way down to avoid OOME; my initial reading was that we need to increase the ceiling above whatever default comes in via HDFS. It would be good to clarify this, and also to document what units the value is in.

3) Docs suggest: "Ensure data locality. In hbase-site.xml, set hbase.hstore.min.locality.to.skip.major.compact = 0.7 (Meaning that 0.7 <= n <= 1)"

I can't find anything else about this property in the docs. Digging through the code, I find an oblique reference to HBASE-11195, but there's no release note or documentation there, and reading the issue doesn't help me understand how this operates either. It looks like there was follow-on work done, but it would be useful to know how we arrived at 0.7 (it seems arbitrary) and how an operator could figure out whether that setting is good for them or needs to slide higher or lower.

Thanks,
Mike
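P.S. Pulling the three settings together, this is the hbase-site.xml fragment I *think* the docs are recommending (values copied from the book; whether skip.checksum should be set at all is exactly question 1, so treat this as a sketch of the docs' advice, not verified guidance):

```xml
<!-- Short-circuit reads per the HBase book section linked above. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <!-- Question 1: the book says set this, but FSUtils warns when it is
       set and HRegionServer claims it is auto-set in HFileSystem. -->
  <name>dfs.client.read.shortcircuit.skip.checksum</name>
  <value>true</value>
</property>
<property>
  <!-- Question 2: appears to match the value HBase falls back to when
       unset; units undocumented (presumably bytes). -->
  <name>dfs.client.read.shortcircuit.buffer.size</name>
  <value>131072</value>
</property>
<property>
  <!-- Question 3: locality ratio in [0, 1]; provenance of 0.7 unclear. -->
  <name>hbase.hstore.min.locality.to.skip.major.compact</name>
  <value>0.7</value>
</property>
```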
