+1, great choice

On Wed, Apr 15, 2020 at 10:03 AM Nick Dimiduk <ndimi...@gmail.com> wrote:

> Branching off this subject from the original thread.
>
> On Wed, Apr 15, 2020 at 9:56 AM Andrew Purtell <andrew.purt...@gmail.com>
> wrote:
>
> > Quick Start and Production are exclusive configurations.
> >
> > Yes absolutely.
> > Quick Start, as you say, should have as few steps to up and running as
> > possible.
> >
> > Production requires a real distributed filesystem for persistence and that
> > means HDFS and that means, whatever the provisioning and deployment and
> > process management (Ambari or k8s or...) choices are going to be, they will
> > not be a Quick Start.
> >
> > We’ve always had this problem. The Quick Start simply can’t produce a
> > system capable of durability because prerequisites for durability are not
> > quick to set up.
> >
>
> Is this exclusively due to the implementation of `LocalFileSystem` or are
> there other issues at play? I've seen there's also `RawLocalFileSystem` but
> I haven't investigated their relationship, its capabilities, or if we
> might profit from its use for the Quick Start experience.
>
> > Specifically about /tmp... I agree that’s not a good default. Time and
> > again I’ve heard people complain that the tmp cleaner has removed their
> > test data. It shouldn’t be surprising but is and that is real feedback on
> > mismatch of user expectation to what we are providing in that
> > configuration. Addressing this aspect of the Quick Start experience would
> > be a simple change: make the default a new directory in $HOME, perhaps
> > “.hbase”.
> >
>
> I propose changing the default value of `hbase.tmp.dir` as shipped in the
> default hbase-site.xml to be simply `tmp`, as I documented in my change on
> HBASE-24106. That way it's not hidden somewhere and it's self-contained to
> this unpacking of the source/binary distribution.
> I.e., there's no need to worry about upgrading the data stored there when
> a user experiments with a new version.
>
> > On Apr 15, 2020, at 9:40 AM, Nick Dimiduk <ndimi...@apache.org> wrote:
> > >
> > > Hi folks,
> > >
> > > I'd like to bring up the topic of the experience of new users as it
> > > pertains to use of the `LocalFileSystem` and its associated (lack of) data
> > > durability guarantees. By default, an unconfigured HBase runs with its root
> > > directory on a `file:///` path. This path is picked up as an instance of
> > > `LocalFileSystem`. Hadoop has long offered this class, but it has never
> > > supported `hsync` or `hflush` stream characteristics. Thus, when HBase runs
> > > on this configuration, it is unable to ensure that WAL writes are durable,
> > > and thus will ACK a write without this assurance. This is the case even
> > > when running in a fully durable WAL mode.
> > >
> > > This impacts a new user, someone kicking the tires on HBase following our
> > > Getting Started docs. On Hadoop 2.8 and before, an unconfigured HBase will
> > > WARN and carry on. On Hadoop 2.10+, HBase will refuse to start. The book
> > > describes a process of disabling stream capability enforcement as a first
> > > step. This is a mandatory configuration for running HBase directly out of
> > > our binary distribution.
> > >
> > > HBASE-24086 restores the behavior on Hadoop 2.10+ to that of running on
> > > 2.8: log a warning and carry on. The critique of this approach is that it's
> > > far too subtle, too quiet for a system operating in a state known to not
> > > provide data durability.
> > >
> > > I have two assumptions/concerns around the state of things, which prompted
> > > my solution on HBASE-24086 and the associated doc update on HBASE-24106.
> > >
> > > 1. No one should be running a production system on `LocalFileSystem`.
> > >
> > > The initial implementation checked both for `LocalFileSystem` and
> > > `hbase.cluster.distributed`. When running on the former and the latter is
> > > false, we assume the user is running a non-production deployment and carry
> > > on with the warning. When the latter is true, we assume the user intended a
> > > production deployment and the process terminates due to stream capability
> > > enforcement. Subsequent code review resulted in skipping the
> > > `hbase.cluster.distributed` check and simply warning, as was done on 2.8
> > > and earlier.
> > >
> > > (As I understand it, we've long used the `hbase.cluster.distributed`
> > > configuration to decide if the user intends this runtime to be a production
> > > deployment or not.)
> > >
> > > Is this a faulty assumption? Is there a use-case we support where we
> > > condone running a production deployment on the non-durable
> > > `LocalFileSystem`?
> > >
> > > 2. The Quick Start experience should require no configuration at all.
> > >
> > > Our stack is difficult enough to run in a fully durable production
> > > environment. We should make it a priority to ensure it's as easy as
> > > possible to try out HBase. Forcing a user to make decisions about data
> > > durability before they even launch the web ui is a terrible experience, in
> > > my opinion, and should be a non-starter for us as a project.
> > >
> > > (In my opinion, the need to configure either `hbase.rootdir` or
> > > `hbase.tmp.dir` away from `/tmp` is equally bad for a Getting Started
> > > experience. It is a second, more subtle question of data durability that we
> > > should avoid out of the box. But I'm happy to leave that for another
> > > thread.)
> > >
> > > Thank you for your time,
> > > Nick
> > >

--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk
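
For readers following the thread, the "initial implementation" Nick describes under point 1 (warn on `LocalFileSystem` when `hbase.cluster.distributed` is false, terminate when it is true) can be sketched roughly as below. This is only an illustration of the decision as stated in the quoted message, not the code from HBASE-24086: the class `DurabilityCheck`, the `Action` enum, and the `decide` method are hypothetical names, and treating a `file` URI scheme as "runs on `LocalFileSystem`" is a simplifying assumption. Per the thread, code review later dropped the `hbase.cluster.distributed` check in favor of always warning.

```java
import java.net.URI;

/** Illustrative sketch only; all names here are hypothetical, not HBase APIs. */
final class DurabilityCheck {

    enum Action { START, WARN_AND_CONTINUE, REFUSE_TO_START }

    /**
     * Decide how to react to the configured root directory.
     *
     * @param rootDir            the value of hbase.rootdir
     * @param clusterDistributed the value of hbase.cluster.distributed
     */
    static Action decide(URI rootDir, boolean clusterDistributed) {
        // Assumption: a "file" scheme stands in for "resolved to LocalFileSystem",
        // which per the thread does not support hsync/hflush.
        boolean local = "file".equalsIgnoreCase(rootDir.getScheme());
        if (!local) {
            // Assumption: any other filesystem is treated as durable here.
            return Action.START;
        }
        // Local filesystem: distributed mode signals production intent, so refuse;
        // otherwise assume a Quick Start style run, warn, and carry on.
        return clusterDistributed ? Action.REFUSE_TO_START : Action.WARN_AND_CONTINUE;
    }

    public static void main(String[] args) {
        System.out.println(decide(URI.create("file:///tmp/hbase"), false));
        System.out.println(decide(URI.create("file:///tmp/hbase"), true));
        System.out.println(decide(URI.create("hdfs://namenode:8020/hbase"), true));
    }
}
```

The behavior that was eventually adopted, as described above, corresponds to returning `WARN_AND_CONTINUE` for every local root directory regardless of the distributed flag.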