Repository: kylin Updated Branches: refs/heads/document b96a01e5a -> 84c961a39
add post kylin guide for aws emr Signed-off-by: lidongsjtu <[email protected]> Project: http://git-wip-us.apache.org/repos/asf/kylin/repo Commit: http://git-wip-us.apache.org/repos/asf/kylin/commit/84c961a3 Tree: http://git-wip-us.apache.org/repos/asf/kylin/tree/84c961a3 Diff: http://git-wip-us.apache.org/repos/asf/kylin/diff/84c961a3 Branch: refs/heads/document Commit: 84c961a3979fd2552c2da11553c06249f16bd2ec Parents: b96a01e Author: Roger Shi <[email protected]> Authored: Tue May 24 16:34:34 2016 +0800 Committer: lidongsjtu <[email protected]> Committed: Tue May 24 16:37:16 2016 +0800 ---------------------------------------------------------------------- website/_posts/blog/2016-05-24-aws-emr.md | 78 +++++++++++++++++++++++++ website/images/blog/aws_emr_console.png | Bin 0 -> 62597 bytes 2 files changed, 78 insertions(+) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/kylin/blob/84c961a3/website/_posts/blog/2016-05-24-aws-emr.md ---------------------------------------------------------------------- diff --git a/website/_posts/blog/2016-05-24-aws-emr.md b/website/_posts/blog/2016-05-24-aws-emr.md new file mode 100644 index 0000000..cdfae19 --- /dev/null +++ b/website/_posts/blog/2016-05-24-aws-emr.md @@ -0,0 +1,78 @@ +--- +layout: post-blog +title: Apache Kylin Guide for AWS EMR User +date: 2016-05-24 11:00:00 +author: Roger Shi +categories: blog +--- + +# Install Apache Kylin on EMR # +1. Create an EMR cluster from AWS console, and remember to pick the applications configuration which contains HBase and Hive as shown here. + + +2. Login to master node of the cluster with ssh. ([instruction](http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-connect-master-node-ssh.html)) + +3. Download and install Kylin binary, we take Kylin version 1.5.1 as example here (Please refer to [Kylin download page](http://kylin.apache.org/download/) for latest binary package). Run the commands below. + +``` +wget https://dist.apache.org/repos/dist/release/kylin/apache-kylin-1.5.1/apache-kylin-1.5.1-HBase1.1.3-bin.tar.gz +tar xzf apache-kylin-1.5.1-HBase1.1.3-bin.tar.gz +cd apache-kylin-1.5.1-bin +export KYLIN_HOME=`pwd` +``` + +# Configure Apache Kylin + +To make Kylin run well on AWS EMR we need to change some scripts. The fix can be found in this [commit](https://github.com/apache/kylin/commit/dc08186d570e16b37d9ddaab80aba28801cdb3d0) and it'll be available in Kylin's future version. + +*First file is $KYLIN_HOME/bin/find-hive-dependency.sh, starting from line 68. + +``` + 68 if [ -z "$HCAT_HOME" ] + 69 then + 70 echo "HCAT_HOME not found, try to find hcatalog path from hadoop home" + 71 hadoop_home=`echo $hive_exec_path | awk -F '/hive.*/lib/' '{print $1}'` + + is_aws=`uname -r | grep amzn` + 72 if [ -d "${hadoop_home}/hive-hcatalog" ]; then + 73 hcatalog_home=${hadoop_home}/hive-hcatalog + 74 elif [ -d "${hadoop_home}/hive/hcatalog" ]; then + 75 hcatalog_home=${hadoop_home}/hive/hcatalog + + elif [ -n is_aws ] && [ -d "/usr/lib/oozie/lib" ]; then + + hcatalog_home=/usr/lib/oozie/lib + 76 else + 77 echo "Couldn't locate hcatalog installation, please make sure it is installed and set HCAT_HOME to the path." + 78 exit 1 + 79 fi + 80 else + 81 echo "HCAT_HOME is set to: $HCAT_HOME, use it to find hcatalog path:" + 82 hcatalog_home=${HCAT_HOME} + 83 fi +``` + +When Kylin starts running it tries to detect environment such as Hive home dictionary. It goes well most time but not for AWS EMR. In AWS EMR nodes, Hcatalog libs is under "/usr/lib/oozie/lib", which is not expected. The modification makes Kylin handle it specially once detecting it's running on AWS EMR. + +*Second file is $KYLIN_HOME/bin/find-hbase-dependency.sh, starting from line 20. + +``` + 20 hbase_classpath=`hbase classpath` + + + + is_aws=`uname -r | grep amzn` + + if [ -n is_aws ] && [ -d "/usr/lib/oozie/lib" ]; then + + export HBASE_ENV_INIT="true" + + fi + + + 21 arr=(`echo $hbase_classpath | cut -d ":" --output-delimiter=" " -f 1-`) +``` + +In AWS EMR, Hbase scripts reset environment variable HBASE_CLASSPATH in its first run, which is determined by environment variable HBASE_ENV_INIT. Kylin builds its own class path according to this variable. To avoid missing libs caused by HBASE_CLASSPATH reseting, Kylin set HBASE_ENV_INIT to "true" after running "hbase classpath" so that HBASE_CLASSPATH won't be reset next time run "hbase classpath". + +# Load sample data and start Apache Kylin + +Run following commands to load sample data and start Kylin service. + +``` +$KYLIN_HOME/bin/sample.sh +$KYLIN_HOME/bin/kylin.sh start +``` + +Now you can play with Kylin. For further guide, please refer to [Kylin official doc](http://kylin.apache.org/docs15/). \ No newline at end of file http://git-wip-us.apache.org/repos/asf/kylin/blob/84c961a3/website/images/blog/aws_emr_console.png ---------------------------------------------------------------------- diff --git a/website/images/blog/aws_emr_console.png b/website/images/blog/aws_emr_console.png new file mode 100644 index 0000000..457e44b Binary files /dev/null and b/website/images/blog/aws_emr_console.png differ
