I'm attempting to use a remote cluster with PIO 0.12.1. When I run pio-start-all, it starts HBase locally and does not use the remote cluster as configured. I've copied the HBase and Hadoop conf files from the cluster into the locally configured directories. I set this up in the past with a similar configuration on PIO 0.10.0. With that version I could start PIO with only the HBase and Hadoop conf files present. That no longer seems to be the case.
If I only provide the cluster configs, it complains that it cannot find start-hbase.sh. If I provide an HBase installation with the cluster configs, it starts a local HBase and does not use the remote cluster. Below is my PIO configuration:

########
#!/usr/bin/env bash
#
# Safe config that will work if you expand your cluster later
SPARK_HOME=$PIO_HOME/vendors/spark
ES_CONF_DIR=$PIO_HOME/vendors/elasticsearch
HADOOP_CONF_DIR=$PIO_HOME/vendors/hadoop/conf
HBASE_CONF_DIR==$PIO_HOME/vendors/hbase/conf

# Filesystem paths where PredictionIO uses as block storage.
PIO_FS_BASEDIR=$HOME/.pio_store
PIO_FS_ENGINESDIR=$PIO_FS_BASEDIR/engines
PIO_FS_TMPDIR=$PIO_FS_BASEDIR/tmp

# PredictionIO Storage Configuration
PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH

PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE

# Need to use HDFS here instead of LOCALFS to enable deploying to
# machines without the local model
PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=HDFS

# What store to use for what data
# Elasticsearch Example
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=$PIO_HOME/vendors/elasticsearch
# The next line should match the ES cluster.name in ES config
PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=dsp_es_cluster

# For clustered Elasticsearch (use one host/port if not clustered)
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=ip-10-0-1-136.us-gov-west-1.compute.internal,ip-10-0-1-126.us-gov-west-1.compute.internal,ip-10-0-1-126.us-gov-west-1.compute.internal
#PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300,9300,9300
#PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
# PIO 0.12.0+ uses the REST client for ES 5+ and this defaults to
# port 9200, change if appropriate but do not use the Transport Client port
# PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9200,9200,9200

PIO_STORAGE_SOURCES_HDFS_TYPE=hdfs
PIO_STORAGE_SOURCES_HDFS_PATH=hdfs://ip-10-0-1-138.us-gov-west-1.compute.internal:8020/models

# HBase Source config
PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
PIO_STORAGE_SOURCES_HBASE_HOME=$PIO_HOME/vendors/hbase

# Hbase clustered config (use one host/port if not clustered)
PIO_STORAGE_SOURCES_HBASE_HOSTS=ip-10-0-1-138.us-gov-west-1.compute.internal,ip-10-0-1-209.us-gov-west-1.compute.internal,ip-10-0-1-79.us-gov-west-1.compute.internal
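
One detail in the configuration above that may be worth checking: the HBASE_CONF_DIR line is written with a doubled "=". In bash, everything after the first "=" becomes the value, so the variable would begin with a literal "=" and would not name a real directory. The sketch below only demonstrates that parsing behaviour; the PIO_HOME path and the HBASE_CONF_DIR_FIXED name are hypothetical, and whether this is related to HBase being started locally is not confirmed.

```shell
#!/usr/bin/env bash
# Hypothetical install location, used only for illustration.
PIO_HOME=/opt/pio

# Copied from the configuration above, including the doubled "=".
HBASE_CONF_DIR==$PIO_HOME/vendors/hbase/conf
printf 'as written: %s\n' "$HBASE_CONF_DIR"

# The same assignment with a single "=", for comparison
# (HBASE_CONF_DIR_FIXED is an illustrative name, not a PIO variable).
HBASE_CONF_DIR_FIXED=$PIO_HOME/vendors/hbase/conf
printf 'single "=": %s\n' "$HBASE_CONF_DIR_FIXED"
```

Running the same printf lines after sourcing the real pio-env.sh would show what value the PIO scripts actually receive for HBASE_CONF_DIR.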
