On Mon, Jan 25, 2016 at 12:14 PM, Josh Elser <[email protected]> wrote:
> I've long been waffling about the usefulness of our "infinite retry" logic.
> It's great for daemons. It sucks for humans.
>
> Maybe there's a story in addressing this via ClientConfiguration -- let
> the user tell us the policy they want to follow.

+1 for configurable retry policy. Curator has a configurable retry policy.
It would be good to see how it works when designing something for Accumulo.

> John Vines wrote:
>
>> Of course, it's when I hit send that I realize that we could mitigate by
>> making the client aware of the master state, and if the system is shut
>> down (which was the case for that ticket), then it can fail quickly with
>> a descriptive message.
>>
>> On Mon, Jan 25, 2016 at 10:58 AM John Vines <[email protected]> wrote:
>>
>>> While we want to be fault tolerant, there's a point where we want to
>>> eventually fail. I know we have a couple of never-ending retry loops
>>> that need to be addressed
>>> (https://issues.apache.org/jira/browse/ACCUMULO-1268), but I'm unsure
>>> if queries suffer from this problem.
>>>
>>> Unfortunately, fault tolerance is a bit at odds with instant
>>> notification of system issues, since some of the fault tolerance is
>>> temporally oriented. And that ticket lacks context on whether it never
>>> fails out or fails out eventually (but takes too long for the user).
>>>
>>> On Sun, Jan 24, 2016 at 7:46 PM Christopher <[email protected]> wrote:
>>>
>>>> I saw this bug report:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1300987
>>>>
>>>> As far as I can tell, they are reporting normal, expected, and desired
>>>> behavior of Accumulo as a bug. But, is there something we can do
>>>> upstream to enable fast failures in the case of Accumulo not running
>>>> to support their use case?
>>>>
>>>> Personally, I don't see how we can reliably detect within the client
>>>> that the cluster is down or up, vs. a normal temporary server
>>>> outage/migration, since there is no single point of authority for
>>>> Accumulo to determine its overall operating status if ZooKeeper is
>>>> running and no other servers are. Am I wrong?
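
To make the configurable-retry idea a bit more concrete, here is a rough sketch of what a pluggable policy could look like. Everything in it is hypothetical: RetrySketch, RetryPolicy, retryForever, boundedBackoff and callWithRetry are made-up names for illustration, not existing Accumulo or Curator APIs, and how such a policy would be wired into ClientConfiguration is left open. The point is only that a daemon could keep the "retry forever" behavior while an interactive client picks a bounded policy and gets the underlying error back after a few attempts.

```java
import java.util.concurrent.Callable;

/** Hypothetical sketch only; none of these names exist in Accumulo or Curator. */
public class RetrySketch {

    /** Returns the delay in ms before the next attempt, or -1 to give up. */
    interface RetryPolicy {
        long nextDelayMs(int retriesSoFar);
    }

    /** Roughly the current "infinite retry" behavior: never give up. */
    static RetryPolicy retryForever(long delayMs) {
        return retriesSoFar -> delayMs;
    }

    /** Give up after maxRetries, doubling the delay after each failure. */
    static RetryPolicy boundedBackoff(int maxRetries, long baseDelayMs) {
        return retriesSoFar -> retriesSoFar < maxRetries
                ? baseDelayMs * (1L << Math.min(retriesSoFar, 20)) // cap the shift to avoid overflow
                : -1;
    }

    /** Runs op, retrying on any exception until the policy says to stop. */
    static <T> T callWithRetry(Callable<T> op, RetryPolicy policy) throws Exception {
        int retries = 0;
        while (true) {
            try {
                return op.call();
            } catch (Exception e) {
                long delay = policy.nextDelayMs(retries);
                if (delay < 0) {
                    throw e; // surface the last failure to the caller
                }
                Thread.sleep(delay);
                retries++;
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        try {
            // Simulates a cluster that is down: every attempt fails.
            callWithRetry(() -> {
                attempts[0]++;
                throw new IllegalStateException("no servers available");
            }, boundedBackoff(3, 10));
        } catch (Exception e) {
            System.out.println("gave up after " + attempts[0] + " attempts: " + e.getMessage());
        }
    }
}
```

With boundedBackoff(3, 10) the failing call is tried once plus three retries and then the last exception reaches the caller; swapping in retryForever(...) would keep the daemon-style behavior.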
