That's definitely difficult to encapsulate. You can end up with a mix of
Apache conventions, the conventions of your OS of choice, and vendor conventions.
The thing that I usually see is applications installed under /usr/lib
(e.g. /usr/lib/accumulo), the relevant executables linked to /usr/bin to
get them on the $PATH, /usr/lib/accumulo/conf linked to
/etc/accumulo/conf (with indirection for your configuration mgmt tool of
choice -- e.g. alternatives on RHEL-based systems), and
/usr/lib/accumulo/logs linked to /var/log/accumulo.
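A rough sketch of that wiring, run as root (the exact commands, and the
assumption that the tarball is unpacked at /usr/lib/accumulo, are
illustrative rather than from any official packaging):

  ln -s /usr/lib/accumulo/bin/accumulo /usr/bin/accumulo   # put the launcher on the PATH
  mkdir -p /etc/accumulo /var/log/accumulo
  mv /usr/lib/accumulo/conf /etc/accumulo/conf             # real conf dir lives under /etc
  ln -s /etc/accumulo/conf /usr/lib/accumulo/conf          # link it back into the install
  rmdir /usr/lib/accumulo/logs 2>/dev/null                 # drop the stock logs dir if present
  ln -s /var/log/accumulo /usr/lib/accumulo/logs           # logs land under /var/log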
Typically you would want all of that chown'ed to the accumulo user. Be
sure to chmod /etc/accumulo/conf and /var/log/accumulo to 700 so you don't
inadvertently leak any sensitive configuration or log messages. The rest
of /usr/lib/accumulo can probably just be 755/644 for dirs/files,
respectively.
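Something like the following would cover that, assuming an accumulo user
and group already exist (the names are an assumption, match whatever your
init scripts run as):

  chown -R accumulo:accumulo /usr/lib/accumulo /etc/accumulo /var/log/accumulo
  chmod 700 /etc/accumulo/conf /var/log/accumulo
  find /usr/lib/accumulo/ -type d -exec chmod 755 {} +
  find /usr/lib/accumulo/ -type f -exec chmod 644 {} +
  chmod 755 /usr/lib/accumulo/bin/*                        # keep the launch scripts executable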
This can also map well to development machines if you use /usr/local/lib
for installations and skip the symlinks out to /etc and /var.
On 3/19/14, 2:32 PM, Benjamin Parrish wrote:
Finally got it. It came down to a user and ownership issue with my Hadoop,
ZooKeeper, and Accumulo installs. Does anyone have a knowledge base for
this info that lays out a standard for what users should be created, where
folders should be created, permissions, ownership, etc.? I feel like that
would be invaluable information for a newbie setting things up... or maybe
I just tried to jump into something head first without being a Linux power
user.
On Wed, Mar 19, 2014 at 10:04 AM, Sean Busbey <[email protected]> wrote:
Also, on the off chance that some other part of your system is
exporting an incorrect HADOOP_CONF_DIR, you should still run this
confirmation step from earlier:
> You can verify this by doing
>
> ssh ${HOST} "bash -c 'echo \${HADOOP_CONF_DIR:-no hadoop conf}'"
>
> as the accumulo user on the master server, where HOST is the tserver.
For the accumulo-env.sh you posted to impact things, the above
command would have to result in the output "no hadoop conf".
-Sean
On Wed, Mar 19, 2014 at 9:01 AM, Sean Busbey <[email protected]> wrote:
Josh is correct about the behavior of the ZooKeeper cli. As an
aside, how big is this cluster? Five ZooKeeper servers shouldn't
be needed until you get past ~100 nodes, unless you're just
going for more fault tolerance.
Could you update your gist with the changes to accumulo-env.sh?
It's much easier to follow there.
Could you also re-run the accumulo classpath command on the
tserver after updating accumulo-env.sh across the cluster?
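Something along these lines on the tserver should show whether the Hadoop
conf dir made it onto the classpath (the grep is just an illustration):

  $ACCUMULO_HOME/bin/accumulo classpath | grep -i conf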
We already know the proximal cause of the failure: the Hadoop
configuration files aren't ending up in the classpath for the
tabletservers. I can think of a few things that would show this
in the tablet server logs, but you've already described hitting
most of them.
-Sean
On Wed, Mar 19, 2014 at 8:27 AM, Benjamin Parrish <[email protected]> wrote:
So, I am back to no clue now...
On Wed, Mar 19, 2014 at 9:13 AM, Josh Elser <[email protected]> wrote:
I think by default zkCli.sh will just try to connect to
localhost. You can change this by providing the quorum
string to the script with the -server option.
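For example, using the quorum hosts from your zoo.cfg:

  zkCli.sh -server hadoop-node-1:2181,hadoop-node-2:2181,hadoop-node-3:2181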
On Mar 19, 2014 8:29 AM, "Benjamin Parrish" <[email protected]> wrote:
I adjusted accumulo-env.sh to have hard-coded values as seen below.
Are there any logs that could shed some light on this issue?
If it also helps, I am using CentOS 6.5, Hadoop 2.2.0, and ZooKeeper 3.4.6.
I also ran across this, which didn't look right...
Welcome to ZooKeeper!
2014-03-19 08:25:53,479 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@975] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2014-03-19 08:25:53,483 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@852] - Socket connection established to localhost/127.0.0.1:2181, initiating session
JLine support is enabled
[zk: localhost:2181(CONNECTING) 0] 2014-03-19 08:25:53,523 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1235] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x144da4e00d90000, negotiated timeout = 30000
Should ZooKeeper try to hit localhost/127.0.0.1?
my zoo.cfg looks like this....
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper/data
clientPort=2181
server.1=hadoop-node-1:2888:3888
server.2=hadoop-node-2:2888:3888
server.3=hadoop-node-3:2888:3888
server.4=hadoop-node-4:2888:3888
server.5=hadoop-node-5:2888:3888
--
Benjamin D. Parrish
H: 540-597-7860