If you look at line 116 of the namenode log, it took ~3 seconds for the NameNode to get through startup, leave safe mode, and be ready to serve RPC.
If the gc / master / whatever is trying to grab things prior to that, it would explain the failure you're seeing.

What's the last modified timestamp on that GC log?

On Wed, Jul 9, 2014 at 11:43 AM, David Medinets <[email protected]> wrote:

> I see the same connection refused message in master, monitor, tserver, and
> gc logs. Maybe it's a coordination issue where one process takes a bit of
> time to fully start?
>
> GIST of ./make_image.sh
> https://gist.github.com/medined/c1677c278bd72cd2ee64
>
> $ ./make_container.sh grail grail
> db73a147ef89a683435475ac9fcba7ef31219710e39759afd87b0742f6cc7490
>
> GIST of file diff from base image
> https://gist.github.com/medined/fe7d04bd4aaa902a66ed
>
> Use ./enter_image.sh grail to investigate container
>
> GIST of namenode log
> https://gist.github.com/medined/46ce499603d75888a118
>
> GIST of datanode log
> https://gist.github.com/medined/1e99bb6db3466427013d
>
> GIST of accumulo gc log
> https://gist.github.com/medined/c1abf7eaee876057e38f
>
>
> On Wed, Jul 9, 2014 at 12:29 PM, Sean Busbey <[email protected]> wrote:
>
> > can you put your namenode, datanode, and gc logs into a gist?
> >
> >
> > On Wed, Jul 9, 2014 at 11:20 AM, David Medinets <[email protected]>
> > wrote:
> >
> > > I updated my accumulo-env.sh so that Accumulo uses IPV4. And now the
> > > shell starts, I can create tables and insert entries. I can even view
> > > the monitor page. However, the logs still show the connection refused
> > > error.
> > >
> > >
> > > On Wed, Jul 9, 2014 at 10:38 AM, Sean Busbey <[email protected]>
> > > wrote:
> > >
> > > > This might be the same ipv6 issue that's causing your monitor
> > > > failures.
> > > >
> > > > --
> > > > Sean
> > > > On Jul 9, 2014 9:13 AM, "David Medinets" <[email protected]>
> > > > wrote:
> > > >
> > > > > I'm hoping someone has a few minutes to help debug this networking
> > > > > issue.
> > > > > I'm running a single-node Accumulo 1.5.1 instance (using Hadoop 2.4)
> > > > > inside Docker. I can find the Accumulo instance id manually:
> > > > >
> > > > > -bash-4.1# hdfs dfs -ls /accumulo/instance_id
> > > > > Found 1 items
> > > > > -rw-r--r-- 1 accumulo accumulo 0 2014-07-09 08:22 /accumulo/instance_id/9421cd33-5f37-4f6d-b645-372feb431cae
> > > > >
> > > > > But when the gc tries to find the instance id, I see this message in
> > > > > the log file:
> > > > >
> > > > > -bash-4.1# cat /var/log/supervisor/accumulo-gc-stdout---supervisor-nwDCZU.log
> > > > > 2014-07-09 09:04:07,006 [client.ZooKeeperInstance] ERROR: Problem reading instance id out of hdfs at /accumulo/instance_id
> > > > > java.net.ConnectException: Call From grail/172.17.0.2 to grail:8020 failed on connection exception: java.net.ConnectException: Connection refused;
> > > > >
> > > > > The hostname is 'grail' which resolves to 172.17.0.2 via /etc/hosts
> > > > > and it can ping itself:
> > > > >
> > > > > david@zareason-verix545:~/projects/docker-builds/accumulo$ ./enter_image.sh grail
> > > > > -bash-4.1# ping grail
> > > > > PING grail (172.17.0.2) 56(84) bytes of data.
> > > > > 64 bytes from grail (172.17.0.2): icmp_seq=1 ttl=64 time=0.083 ms
> > > > > 64 bytes from grail (172.17.0.2): icmp_seq=2 ttl=64 time=0.054 ms
> > > > >
> > > > > Do I need to use IP addresses in the 'gc' and other configuration
> > > > > files?
> > > > >
> > > > > Any ideas for me to try?
> > > > >
> > > > > When the NameNode starts, the log file has no errors. It starts with:
> > > > >
> > > > > STARTUP_MSG: Starting NameNode
> > > > > STARTUP_MSG: host = grail/172.17.0.2
> > > > >
> >
> > --
> > Sean
>

--
Sean
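
One way to test the startup-ordering theory, sketched under the assumption that the container can run a small shell step before supervisor launches the Accumulo processes: keep retrying the same hdfs check from the original message until the NameNode answers on grail:8020. wait_until is a hypothetical helper (not part of Hadoop or Accumulo), and the 60-second limit is an arbitrary choice.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-run COMMAND about once per second until it
# succeeds or roughly TIMEOUT_SECONDS of sleeping have elapsed.
# Returns 0 as soon as COMMAND succeeds, 1 on timeout.
wait_until() {
    timeout_s=$1
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Assumed usage inside the container, before the Accumulo processes start
# (left commented out so the helper can be sourced without Hadoop installed):
#   wait_until 60 hdfs dfs -ls /accumulo/instance_id || exit 1
```

If the connection refused errors disappear once the Accumulo processes wait like this, that points at ordering rather than name resolution.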
