That's definitely difficult to encapsulate. You can end up with a mix of
Apache conventions, the conventions of your OS of choice, and vendor conventions.
The thing that I usually see is applications installed under /usr/lib
(e.g. /usr/lib/accumulo), the relevant executables linked to /usr/bin to
get them on the $PATH, /usr/lib/accumulo/conf linked to
/etc/accumulo/conf (with indirection for your configuration mgmt tool of
choice -- e.g. alternatives on RHEL-based systems), and
/usr/lib/accumulo/logs linked to /var/log/accumulo.
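A rough sketch of that wiring, run as root (the exact commands, and the
assumption that the tarball is unpacked at /usr/lib/accumulo, are
illustrative rather than from any official packaging):

  ln -s /usr/lib/accumulo/bin/accumulo /usr/bin/accumulo   # put the launcher on the PATH
  mkdir -p /etc/accumulo /var/log/accumulo
  mv /usr/lib/accumulo/conf /etc/accumulo/conf             # real conf dir lives under /etc
  ln -s /etc/accumulo/conf /usr/lib/accumulo/conf          # link it back into the install
  rmdir /usr/lib/accumulo/logs 2>/dev/null                 # drop the stock logs dir if present
  ln -s /var/log/accumulo /usr/lib/accumulo/logs           # logs land under /var/log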
Typically you would want all of that chown'ed to the accumulo user. Be
sure to chmod /etc/accumulo/conf and /var/log/accumulo to 700 so you don't
inadvertently leak any sensitive configuration or log messages. The rest
of /usr/lib/accumulo can probably just be 755/644 for dirs/files,
respectively.
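Something like the following would cover that, assuming an accumulo user
and group already exist (the names are an assumption, match whatever your
init scripts run as):

  chown -R accumulo:accumulo /usr/lib/accumulo /etc/accumulo /var/log/accumulo
  chmod 700 /etc/accumulo/conf /var/log/accumulo
  find /usr/lib/accumulo/ -type d -exec chmod 755 {} +
  find /usr/lib/accumulo/ -type f -exec chmod 644 {} +
  chmod 755 /usr/lib/accumulo/bin/*                        # keep the launch scripts executable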
This can also map well to development machines if you use /usr/local/lib
for installations and skip the symlinks out to /etc and /var.
On 3/19/14, 2:32 PM, Benjamin Parrish wrote:
Finally got it. It came down to a user and ownership issue with my Hadoop,
ZooKeeper, and Accumulo installs. Does anyone have a knowledge base for
this info that lays out a standard for what users should be created, where
folders should be created, permissions, ownership, etc.? I feel like that
would be invaluable information for a newbie setting things up... or maybe
I just tried to jump into something head first without being a Linux power
user.
On Wed, Mar 19, 2014 at 10:04 AM, Sean Busbey <[email protected]> wrote:
Also, on the off chance that some other part of your system is
exporting an incorrect HADOOP_CONF_DIR, you should still run this
confirmation step from earlier:
> You can verify this by doing
>
> ssh ${HOST} "bash -c 'echo \${HADOOP_CONF_DIR:-no hadoop conf}'"
>
> as the accumulo user on the master server, where HOST is the tserver.
For the accumulo-env.sh you posted to impact things, the above
command would have to result in the output "no hadoop conf".
-Sean
On Wed, Mar 19, 2014 at 9:01 AM, Sean Busbey <[email protected]> wrote:
Josh is correct about the behavior of the ZooKeeper cli. As an
aside, how big is this cluster? Five ZooKeeper servers shouldn't
be needed until you get past ~100 nodes, unless you're just
going for more fault tolerance.
Could you update your gist with the changes to accumulo-env.sh?
It's much easier to follow there.
Could you also re-run the accumulo classpath command on the
tserver after updating accumulo-env.sh across the cluster?
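Something along these lines on the tserver should show whether the Hadoop
conf dir made it onto the classpath (the grep is just an illustration):

  $ACCUMULO_HOME/bin/accumulo classpath | grep -i conf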
We already know the proximal cause of the failure: the Hadoop
configuration files aren't ending up in the classpath for the
tabletservers. I can think of a few things that would show this
in the tablet server logs, but you've already described hitting
most of them.
-Sean
On Wed, Mar 19, 2014 at 8:27 AM, Benjamin Parrish <[email protected]> wrote:
So, I am back to no clue now...
On Wed, Mar 19, 2014 at 9:13 AM, Josh Elser <[email protected]> wrote:
I think by default zkCli.sh will just try to connect to
localhost. You can change this by providing the quorum
string to the script with the -server option.
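For example, using the quorum hosts from your zoo.cfg:

  zkCli.sh -server hadoop-node-1:2181,hadoop-node-2:2181,hadoop-node-3:2181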
On Mar 19, 2014 8:29 AM, "Benjamin Parrish" <[email protected]> wrote:
I adjusted accumulo-env.sh to have hard-coded values as seen below.
Are there any logs that could shed some light on this issue?
If it also helps, I am using CentOS 6.5, Hadoop 2.2.0, and ZooKeeper 3.4.6.
I also ran across this, which didn't look right...
Welcome to ZooKeeper!
2014-03-19 08:25:53,479 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@975] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2014-03-19 08:25:53,483 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@852] - Socket connection established to localhost/127.0.0.1:2181, initiating session
JLine support is enabled
[zk: localhost:2181(CONNECTING) 0] 2014-03-19 08:25:53,523 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1235] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x144da4e00d90000, negotiated timeout = 30000
Should ZooKeeper try to hit localhost/127.0.0.1?
my zoo.cfg looks like this....
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper/data
clientPort=2181
server.1=hadoop-node-1:2888:3888
server.2=hadoop-node-2:2888:3888
server.3=hadoop-node-3:2888:3888
server.4=hadoop-node-4:2888:3888
server.5=hadoop-node-5:2888:3888
--
Benjamin D. Parrish
H: 540-597-7860