Disclaimer:
Not advocating this as the best approach, just what I'm currently doing. I put
this together pretty quickly, but it should be mostly complete for setting up
Accumulo on CDH HDFS/ZK.
I always do something like this first on CentOS:
$ yum install -y ntp openssh-clients unzip
# set up ssh and ntpd as needed
# install the JDK RPM
# bash this to set up OS specifics
echo "Disabling SELINUX for Optimal CDH Compatability..."
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
echo "Increasing uLimit, aka File Descripter/Handlers for all Users..."
echo "# Adding Support for CDH" >> /etc/security/limits.conf
echo "* - nofile 65536" >>
/etc/security/limits.conf
echo "Disabling IPv6..."
echo "# Disable ipv6" >> /etc/sysctl.conf
echo "net.ipv6.conf.all.disable_ipv6 = 1" >> /etc/sysctl.conf
echo "net.ipv6.conf.default.disable_ipv6 = 1" >> /etc/sysctl.conf
echo "Increasing Swapiness Factor to limit use of swap space."
echo "# swappiness for accumulo" >> /etc/sysctl.conf
echo "vm.swappiness = 10" >> /etc/sysctl.conf
reboot and test OS/services/jdk version…
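After the reboot, a quick sanity check along these lines (just my own habit, adjust as needed):
$ getenforce                     # should print Disabled
$ ulimit -n                      # should print 65536 (run as an affected user)
$ cat /proc/sys/vm/swappiness    # should print 10
$ java -version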
Then I usually extract Accumulo to /opt/accumulo/accumulo-1.5.0 and
make a symlink: /opt/accumulo/accumulo-current -> ./accumulo-1.5.0
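Roughly the commands I use for that (a sketch; the tarball name is whatever your Accumulo 1.5.0 binary distribution is called):
mkdir -p /opt/accumulo
tar xzf accumulo-1.5.0-bin.tar.gz -C /opt/accumulo
ln -s /opt/accumulo/accumulo-1.5.0 /opt/accumulo/accumulo-current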
# make dirs for Accumulo logs, wherever…
mkdir /var/log/accumulo
# let the hdfs user own all your Accumulo folders
chown -R hdfs:hdfs /opt/accumulo
chown -R hdfs:hdfs /var/log/accumulo
# update the hdfs password for the next step (as root):
passwd hdfs
# set up passwordless ssh (test as hdfs afterwards; you should be able to ssh to a
# <node> w/o entering credentials; see the sketch below)
su - hdfs
ssh-copy-id <for all tablet server nodes>
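Something like this, assuming the hdfs user doesn't already have a key (node names are placeholders):
su - hdfs
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
ssh-copy-id hdfs@tserver1
ssh-copy-id hdfs@tserver2
ssh tserver1 hostname    # should return without prompting for a password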
#update your iptables
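For example (these are the 1.5 default ports as I recall them -- tserver 9997, master 9999, gc 50091, monitor 50095 and 4560, tracer 12234 -- verify against your accumulo-site.xml before opening anything):
iptables -I INPUT -p tcp -m multiport --dports 9997,9999,50091,50095,4560,12234 -j ACCEPT
service iptables save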
#env vars
ACCUMULO_HOME=/opt/accumulo/accumulo-1.5.0
JAVA_HOME=/usr/java/default (JDK 7 worked fine in my last install)
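I usually just drop these in a profile script so they survive logins, e.g. (the file path is my own choice, put it wherever you like):
# /etc/profile.d/accumulo.sh
export ACCUMULO_HOME=/opt/accumulo/accumulo-1.5.0
export JAVA_HOME=/usr/java/default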
Settings for accumulo-env.sh in $ACCUMULO_HOME/conf:
# cdh4
export HADOOP_HDFS_HOME=/opt/cloudera/parcels/CDH/lib/hadoop-hdfs
export HADOOP_MAPREDUCE_HOME=/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
test -z "$HADOOP_CONF_DIR" && export
HADOOP_CONF_DIR="$HADOOP_PREFIX/etc/hadoop"
test -z "$JAVA_HOME" && export JAVA_HOME=/usr/java/default
test -z "$ZOOKEEPER_HOME" && export
ZOOKEEPER_HOME=/opt/cloudera/parcels/CDH/lib/zookeeper
test -z "$ACCUMULO_LOG_DIR" && export ACCUMULO_LOG_DIR=$ACCUMULO_HOME/logs
# update all files as appropriate in /opt/accumulo/accumulo-current/conf/*:
# masters, monitor, slaves, tracers, gc, accumulo-site.xml, accumulo-env.sh (see example below)
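For the host files, something along these lines (hostnames are placeholders; one tablet server per line in slaves):
echo master1 > masters
echo master1 > monitor
echo master1 > gc
echo master1 > tracers
printf "tserver1\ntserver2\ntserver3\n" > slaves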
#accumulo-site.xml
<property>
  <name>general.classpaths</name>
  <value>
    $ACCUMULO_HOME/server/target/classes/,
    $ACCUMULO_HOME/lib/accumulo-server.jar,
    $ACCUMULO_HOME/core/target/classes/,
    $ACCUMULO_HOME/lib/accumulo-core.jar,
    $ACCUMULO_HOME/start/target/classes/,
    $ACCUMULO_HOME/lib/accumulo-start.jar,
    $ACCUMULO_HOME/fate/target/classes/,
    $ACCUMULO_HOME/lib/accumulo-fate.jar,
    $ACCUMULO_HOME/proxy/target/classes/,
    $ACCUMULO_HOME/lib/accumulo-proxy.jar,
    $ACCUMULO_HOME/lib/[^.].*.jar,
    $ZOOKEEPER_HOME/zookeeper[^.].*.jar,
    $HADOOP_CONF_DIR,
    $HADOOP_PREFIX/[^.].*.jar,
    $HADOOP_PREFIX/lib/[^.].*.jar,
    $HADOOP_HDFS_HOME/.*.jar,
    $HADOOP_HDFS_HOME/lib/.*.jar,
    $HADOOP_MAPREDUCE_HOME/.*.jar,
    $HADOOP_MAPREDUCE_HOME/lib/.*.jar
  </value>
  <description>Classpaths that accumulo checks for updates and class files.
    When using the Security Manager, please remove the ".../target/classes/" values.
  </description>
</property>
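accumulo-site.xml needs more than just the classpath; a minimal sketch of the other properties I also set (the ZK hosts and secret here are placeholders, pick your own):
<property>
  <name>instance.zookeeper.host</name>
  <value>zk1:2181,zk2:2181,zk3:2181</value>
</property>
<property>
  <name>instance.secret</name>
  <value>CHANGE_ME</value>
</property>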
Then of course, always run your Accumulo binaries/scripts as the hdfs
user. I'm sure I'm missing a few steps here and there…
$ACCUMULO_HOME/bin/accumulo init
…
$ACCUMULO_HOME/bin/start-all.sh
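After start-all.sh I usually just sanity-check that the processes came up, e.g.:
$ sudo -u hdfs jps -ml
# expect org.apache.accumulo.start.Main entries for master, tserver, gc,
# monitor, and tracer (and the monitor page on its default port)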
From: [email protected] [mailto:[email protected]] On Behalf Of Sean Busbey
Sent: Thursday, January 16, 2014 2:20 PM
To: Accumulo User List
Subject: Re: accumulo startup issue: Accumulo not initialized, there is no instance id at /accumulo/instance_id
On Thu, Jan 16, 2014 at 1:14 PM, Kesten Broughton <[email protected]> wrote:
"You should make sure to correct the maximum number of open files for the user
that is running Accumulo."
I have the following in /etc/security/limits.conf on every node in my accumulo cluster:
hdfs soft nofile 65536
hdfs hard nofile 65536
However, I see this for all nodes:
WARN : Max files open on 10.0.11.208 is 32768, recommend 65536
Should it be a different user or something?
'the user that is running Accumulo'
sudo hdfs
hdfs$ bin/accumulo -u root
so is hdfs or root the accumulo user?
The user in question here is the one who starts the Accumulo server processes.
In production environments this should be a user dedicated to Accumulo. FWIW, I
usually name this user "accumulo".
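A quick way to check, assuming the limits.conf entries match the user you actually launch with (a fresh login session is needed for limits.conf changes to take effect; processes inherit the limit from whatever shell or session started them):
$ su - hdfs -c 'ulimit -Hn; ulimit -Sn'
# both should report 65536 once the limits.conf change is picked up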
How do you start up Accumulo? a service script? running
$ACCUMULO_HOME/bin/start-all.sh? something else?