I'm attempting to use a remote cluster with PIO 0.12.1.  When I run
pio-start-all, it starts HBase locally and does not use the remote
cluster as configured.  I've copied the HBase and Hadoop conf files from
the cluster and put them into the locally configured directories.  I set
this up in the past with a similar configuration, but on PIO 0.10.0.
With that version I could start PIO with only the HBase and Hadoop conf
files present.  This no longer seems to be the case.

If I provide only the cluster configs, it complains that it cannot find
start-hbase.sh.  If I provide an HBase installation together with the
cluster configs, it starts a local HBase and does not use the remote
cluster.

Below is my PIO configuration:

########

#!/usr/bin/env bash
#
# Safe config that will work if you expand your cluster later
SPARK_HOME=$PIO_HOME/vendors/spark
ES_CONF_DIR=$PIO_HOME/vendors/elasticsearch
HADOOP_CONF_DIR=$PIO_HOME/vendors/hadoop/conf
HBASE_CONF_DIR==$PIO_HOME/vendors/hbase/conf


# Filesystem paths that PredictionIO uses as block storage.
PIO_FS_BASEDIR=$HOME/.pio_store
PIO_FS_ENGINESDIR=$PIO_FS_BASEDIR/engines
PIO_FS_TMPDIR=$PIO_FS_BASEDIR/tmp

# PredictionIO Storage Configuration
PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH

PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE

# Need to use HDFS here instead of LOCALFS to enable deploying to
# machines without the local model
PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=HDFS

# What store to use for what data
# Elasticsearch Example
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=$PIO_HOME/vendors/elasticsearch
# The next line should match the ES cluster.name in ES config
PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=dsp_es_cluster

# For clustered Elasticsearch (use one host/port if not clustered)
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=ip-10-0-1-136.us-gov-west-1.compute.internal,ip-10-0-1-126.us-gov-west-1.compute.internal,ip-10-0-1-126.us-gov-west-1.compute.internal
#PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300,9300,9300
#PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
# PIO 0.12.0+ uses the REST client for ES 5+ and this defaults to
# port 9200, change if appropriate but do not use the Transport Client port
# PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9200,9200,9200

PIO_STORAGE_SOURCES_HDFS_TYPE=hdfs
PIO_STORAGE_SOURCES_HDFS_PATH=hdfs://ip-10-0-1-138.us-gov-west-1.compute.internal:8020/models

# HBase Source config
PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
PIO_STORAGE_SOURCES_HBASE_HOME=$PIO_HOME/vendors/hbase

# HBase clustered config (use one host/port if not clustered)
PIO_STORAGE_SOURCES_HBASE_HOSTS=ip-10-0-1-138.us-gov-west-1.compute.internal,ip-10-0-1-209.us-gov-west-1.compute.internal,ip-10-0-1-79.us-gov-west-1.compute.internal
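
One way to confirm that the conf directories set in this file actually resolve to the files copied from the cluster is a check like the one below.  This is only a sketch: check_conf_dir is a hypothetical helper, not part of PIO, and hbase-site.xml and core-site.xml are assumed to be among the files copied from the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of PIO): report whether a conf variable
# points at a directory that contains the expected site file.
#   $1 = variable name (used for the message only)
#   $2 = directory the variable resolves to
#   $3 = file expected inside that directory
check_conf_dir() {
  local name="$1" dir="$2" file="$3"
  if [ -f "$dir/$file" ]; then
    echo "$name ok: $dir/$file"
  else
    echo "$name missing: $dir/$file"
    return 1
  fi
}

# Example usage, assuming the configuration above has been sourced
# into the current shell first:
#   check_conf_dir HBASE_CONF_DIR  "$HBASE_CONF_DIR"  hbase-site.xml
#   check_conf_dir HADOOP_CONF_DIR "$HADOOP_CONF_DIR" core-site.xml
```

The helper prints the full path it tested, so a value that does not expand to the intended directory is visible in the message.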
