On Wed, Jul 8, 2015 at 10:23 AM, Stack <[email protected]> wrote:

> On Wed, Jul 8, 2015 at 7:53 AM, Sean Busbey <[email protected]> wrote:
>
> > Hi Folks!
> >
> > For the 1.2 release, I think the consensus is to disable Distributed Log
> > Replay by default due to lack of sufficient testing. At least, that's the
> > only feedback I've heard so far. :)
> >
> >
> > Anyone object to that?
> >
>
> I've been trying it over the last few days. It is easy enough to lose
> data: HBASE-14028. It is a bit tough tracing how the loss is happening
> given more moving parts and that it seems few have treaded this route
> previously; breadcrumbs are sparse (fixing).
>
> I'll keep at this until DLR in 1.2 is for sure a lost cause.
>
> On DLR:
>
> + DLR is a little more involved than DLS -- which is already tough enough
> to follow. It might be best to just punt and come back here after assign
> has been redone (and simplified) on top of pv2; hbase-2.0.0?
>
Agreed. DLR is a very good idea, but unfortunately has not stabilized enough.
The recovery semantics, zk interactions, assignment, etc. make it very complex
to understand and operate. I would vote for not doing any more work on this
side unless we have solved the assignment process. The other problem is that
we cannot have only DLR since if the table is offline DLS is needed, which
forces us to maintain and test two different subsystems.

In the long term, we should be shooting for a simplified solution.

Let's disable in master as well. Once / if we have better testing we can
always re-enable it.

Enis

> + It can actually make for a worse MTTR as it does not do re-lookups during
> replay of a WAL if the target server crashes during DLR; the whole WAL
> replay must timeout before we'll go re-find the new location (30 seconds at
> least).
>
> St.Ack
>
> > Presuming no one does, what do folks think about just disabling it by
> > default in the current branch-1?
> >
> >
> > That isn't to say it couldn't switch to on-by-default at a latter 1.y
> > release. It's just that we had to turn it off right before the 1.1 releases
> > as well, and I'd prefer we avoid these last minute changes in favor of
> > waiting until someone has the time to prioritize thorough testing.
> >
> > --
> > Sean
