Turns out I am hitting a bug (as far as I can tell) that is fixed in
trunk:
http://svn.apache.org/viewvc/accumulo/trunk/server/src/main/java/org/apache/accumulo/server/Accumulo.java?r1=1437810&r2=1438335&diff_format=h
I believe the prior code tries to talk to ZooKeeper before it
has called Accumulo.waitForZookeeperAndHdfs(). No?
It seems this was fixed for a totally different reason (ACCUMULO-928,
which doesn't mention the specific issue I'm describing)
and unknowingly addressed the bug I'm hitting as well.
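For illustration, here is roughly the ordering I mean, as a hedged
sketch using the plain ZooKeeper client API rather than the actual
Accumulo server code (the connect string, timeout, and class name are
made up for the example):

    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;

    // Sketch of the startup ordering I think the trunk change enforces:
    // block until the ZooKeeper session is actually connected before any
    // startup step tries to read from it.
    public class WaitForZooKeeperSketch {
        public static void main(String[] args) throws Exception {
            ZooKeeper zk = new ZooKeeper("zkhost:2181", 30000, new Watcher() {
                public void process(WatchedEvent event) { /* no-op for the sketch */ }
            });

            // Anything that reads znodes at this point can fail, because the
            // session may not be connected yet -- that's the bug as I read it.

            // Waiting first (analogous to calling waitForZookeeperAndHdfs()
            // up front) makes later reads safe.
            while (zk.getState() != ZooKeeper.States.CONNECTED) {
                Thread.sleep(1000);
            }

            // Now it is safe to read, e.g.:
            zk.getChildren("/", false);
            zk.close();
        }
    }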
On 2/27/2013 11:38 AM, Jeff Blaine wrote:
On 2/27/2013 9:48 AM, Eric Newton wrote:
Sure, just create a ticket and submit a patch, if you're up to it.
I'm not afraid of contributing here; I do elsewhere. Unfortunately,
a patch plus a custom build is not allowed for our project. The effort
requires an unpatched 1.4.2, so I figured I'd ask.
It looks like (after browsing the source a bit) I will have to
figure out a workaround.
If I'm feeling saucy and can make sense of it, I will try to
follow up with a patch someday so that this is addressed for
the next person. I'm not a Java developer, but I do comprehend
most of the basic code I've seen in the repo, based on previous
Java reading.
On Wed, Feb 27, 2013 at 9:38 AM, Jeff Blaine <[email protected]> wrote:
We spin up our slaves first, then the Hadoop namenode + Accumulo
master afterward. By the time the master comes up, the tserver has
already exited on the Accumulo slaves.
Is there a way to increase how hard (or how long) the tserver
tries to connect to the master?
Aside: is there really any reason for the tserver to exit? Why not
just do an exponential backoff of connection attempts until a
30-second interval is reached, and then keep retrying at that
interval indefinitely?
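Roughly what I mean, as a hedged sketch (MasterConnector is a
hypothetical stand-in for whatever RPC call the tserver makes,
not a real Accumulo class):

    // Sketch of the retry policy: exponential backoff on connection
    // attempts, capped at 30 seconds, then retrying at that cap forever
    // instead of exiting.
    public class RetryForeverSketch {
        interface MasterConnector {
            void connect() throws Exception;    // hypothetical stand-in
        }

        static void connectWithBackoff(MasterConnector master) throws InterruptedException {
            long waitMillis = 1000;              // start at 1 second
            final long maxWaitMillis = 30000;    // cap at 30 seconds
            while (true) {
                try {
                    master.connect();
                    return;                      // connected; continue normal startup
                } catch (Exception e) {
                    Thread.sleep(waitMillis);
                    waitMillis = Math.min(waitMillis * 2, maxWaitMillis);
                }
            }
        }
    }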