Hi all, I’ve carefully followed the instructions provided in http://kylin.apache.org/docs23/install/kylin_aws_emr.html
My idea is to use S3 as the storage for HBase. I have configured the cluster following the instructions, but the tables that contain the cube definitions stay "in transition" when I deploy a new cluster, and the Kylin metadata seems outdated.

These are the steps I follow to create the cluster.

Cluster creation command:

aws emr create-cluster \
  --applications Name=Hadoop Name=Hue Name=Spark Name=Zeppelin Name=Ganglia Name=Hive Name=Hbase Name=HCatalog Name=Tez \
  --tags 'hive=' 'spark=' 'zeppelin=' \
  --ec2-attributes 'file://../config/ec2-attributes.json' \
  --release-label emr-5.16.0 \
  --log-uri 's3n://sns-da-logs/' \
  --instance-groups 'file://../config/instance-hive-datawarehouse.json' \
  --configurations 'file://../config/hive-hbase-s3.json' \
  --auto-scaling-role EMR_AutoScaling_DefaultRole \
  --ebs-root-volume-size 10 \
  --service-role EMR_DefaultRole \
  --enable-debugging \
  --name 'hbase-hive-datawarehouse' \
  --scale-down-behavior TERMINATE_AT_TASK_COMPLETION \
  --region us-east-1

My configuration hive-hbase-s3.json:

[
  {
    "Classification": "hive-site",
    "Configurations": [],
    "Properties": {
      "hive.metastore.warehouse.dir": "s3://xxxxxxxx-datawarehouse/hive.db",
      "javax.jdo.option.ConnectionDriverName": "org.mariadb.jdbc.Driver",
      "javax.jdo.option.ConnectionPassword": "xxxxx",
      "javax.jdo.option.ConnectionURL": "jdbc:mysql://xxxxxx:3306/hive_metastore?createDatabaseIfNotExist=true",
      "javax.jdo.option.ConnectionUserName": "xxxx"
    }
  },
  {
    "Classification": "hbase",
    "Configurations": [],
    "Properties": {
      "hbase.emr.storageMode": "s3"
    }
  },
  {
    "Classification": "hbase-site",
    "Configurations": [],
    "Properties": {
      "hbase.rpc.timeout": "3600000",
      "hbase.rootdir": "s3://xxxxxx-hbase/"
    }
  },
  {
    "Classification": "core-site",
    "Properties": {
      "io.file.buffer.size": "65536"
    }
  },
  {
    "Classification": "mapred-site",
    "Properties": {
      "mapred.map.tasks.speculative.execution": "false",
      "mapred.reduce.tasks.speculative.execution": "false",
      "mapreduce.map.speculative": "false",
      "mapreduce.reduce.speculative": "false"
    }
  }
]

When I shut down the cluster I run these commands:

../kylin_home/bin/kylin.sh stop

# Before you shut down/restart the cluster, you must back up the "/kylin" data on HDFS to S3 with S3DistCp
aws s3 rm s3://xxxxxx-config/metadata/kylin/*
s3-dist-cp --src=hdfs:///kylin --dest=s3://xxxxxx-config/metadata/kylin

bash /usr/lib/hbase/bin/disable_all_tables.sh

Please could you tell me what I am missing?

Thanks in advance
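
PS: In case it is relevant, this is roughly what I do (or believe I should be doing) on the new cluster before starting Kylin again. The bucket names are the same placeholders as above, and I am not sure this restore step is correct, which may well be part of the problem:

# Copy the Kylin metadata that was backed up at shutdown from S3 back to HDFS
# (assumes the same s3://xxxxxx-config/metadata/kylin location used above)
s3-dist-cp --src=s3://xxxxxx-config/metadata/kylin --dest=hdfs:///kylin

# Re-enable the HBase tables that disable_all_tables.sh disabled at shutdown,
# e.g. from the hbase shell:  enable 'kylin_metadata'

# Start Kylin again
../kylin_home/bin/kylin.sh start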