Turns out I am hitting a bug (as far as I can tell) that is fixed in
trunk:

http://svn.apache.org/viewvc/accumulo/trunk/server/src/main/java/org/apache/accumulo/server/Accumulo.java?r1=1437810&r2=1438335&diff_format=h

I believe the prior code tries to talk to ZooKeeper before it
has called Accumulo.waitForZookeeperAndHdfs().  No?

It seems this was fixed for a totally different reason (ACCUMULO-928,
which doesn't mention the problem I'm describing), and the change
addressed this bug as a side effect.

On 2/27/2013 11:38 AM, Jeff Blaine wrote:
On 2/27/2013 9:48 AM, Eric Newton wrote:
Sure, just create a ticket and submit a patch, if you're up to it.

I'm not afraid of contributing here. I do elsewhere. Unfortunately,
a patch + custom build for our project is not allowed. The effort
requires unpatched 1.4.2, so I figured I'd ask.

After browsing the source a bit, it looks like I will have to
figure out a workaround.

If I'm feeling saucy and can make sense of it, I will try to
follow up with a patch some day so that this is addressed for
the next person. I'm not a Java developer, but do comprehend
most of the basic code I've seen in the repo based on previous
Java reading.

On Wed, Feb 27, 2013 at 9:38 AM, Jeff Blaine wrote:

    We spin up our slaves first, then the Hadoop namenode + Accumulo
    master afterward. By the time that comes up, tserver has exited
    on the Accumulo slaves.

    Is there a way to make tserver try longer to connect
    to the master?

    Aside: is there really any reason for tserver to exit? Why not
    just do an exponential backoff of connection attempts until
    the delay hits 30 seconds, and then stay there indefinitely?
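
For illustration, here is a minimal sketch of the backoff suggested above.
It is not Accumulo code: the class name, the 1-second starting delay, and
the tryConnect callback are all hypothetical, and only the 30-second cap
comes from the message. The delay doubles after each failed attempt,
settles at the cap, and the loop never gives up.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Accumulo.
class MasterBackoff {
    static final long INITIAL_DELAY_MS = 1_000;  // assumed starting delay
    static final long MAX_DELAY_MS = 30_000;     // the 30-second cap from the message

    // Delay to wait after failed attempt number `attempt` (0-based):
    // doubles each time and never exceeds MAX_DELAY_MS.
    static long delayMillis(int attempt) {
        long delay = INITIAL_DELAY_MS;
        for (int i = 0; i < attempt && delay < MAX_DELAY_MS; i++) {
            delay = Math.min(delay * 2, MAX_DELAY_MS);
        }
        return delay;
    }

    // Keep calling tryConnect (a stand-in for whatever connects to the
    // master) until it returns true, sleeping between attempts.
    static int connectForever(BooleanSupplier tryConnect) throws InterruptedException {
        int attempt = 0;
        while (!tryConnect.getAsBoolean()) {
            Thread.sleep(delayMillis(attempt));
            if (attempt < Integer.MAX_VALUE) {
                attempt++;
            }
        }
        return attempt;  // number of failed attempts before success
    }
}
```

With these assumed values the waits run 1s, 2s, 4s, 8s, 16s, then 30s for
every attempt after that, so a slave started well before the master would
still connect once the master comes up.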

