The UR uses the Elasticsearch REST API for most operations. The ES servers it
should talk to need to be listed in the sparkConf section of engine.json, like
this:

    "sparkConf": {
      "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
      "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
      "spark.kryo.referenceTracking": "false",
      "spark.kryoserializer.buffer": "300m",
      "es.index.auto.create": "true",
      "es.nodes": "ip-xxx.ec2.internal,ip-xxx.ec2.internal,xxx.ec2.internal",
      "es.username": "some-user",
      "es.password": "some-password"
    },

The ES nodes should be IP addresses or DNS names that are reachable from your
PIO machine. BTW, I may have the username/password parameter names wrong, so
check the ES docs.
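
One way to confirm that every entry in es.nodes is reachable from the PIO machine is a quick TCP check against the HTTP port. The following is only a sketch: it assumes the default REST port 9200 when an entry carries no port, it assumes engine.json has a top-level sparkConf object as shown above, and the function names are made up for illustration. It does not handle IPv6 literals or authentication.

```python
import json
import socket

DEFAULT_HTTP_PORT = 9200  # assumption: ES default REST port; adjust for your cluster


def parse_es_nodes(es_nodes, default_port=DEFAULT_HTTP_PORT):
    """Split an "es.nodes" value into (host, port) pairs."""
    pairs = []
    for entry in es_nodes.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        host, sep, port = entry.partition(":")
        pairs.append((host, int(port) if sep else default_port))
    return pairs


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refusals, and timeouts
        return False


def check_engine_json(path="engine.json"):
    """Map each "host:port" listed under sparkConf to its reachability."""
    with open(path) as f:
        spark_conf = json.load(f)["sparkConf"]
    return {
        f"{host}:{port}": is_reachable(host, port)
        for host, port in parse_es_nodes(spark_conf["es.nodes"])
    }
```

Calling check_engine_json() from the engine directory on the PIO machine returns a dict such as {"host:9200": True, ...}; any False entry points at a node the REST client will not be able to use.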

This is how the REST client needs to be configured; pio-env.sh is not used for
it. PIO itself never uses the REST client, only the UR does, and we will shortly
move PredictionIO to using the REST API exclusively. At that point the config
will move to pio-env.sh. For now, use the above to change anything needed by
the Elasticsearch-Hadoop library, which writes the model to ES, and by the REST
API, which creates indexes and aliases.

We use Elasticsearch v1.7.6 and it works fine.

Also make sure to go through the entire workflow after any engine.json change:
build, train, deploy.
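
If you want to script that workflow so no step gets skipped, something like the sketch below would do. It assumes the pio command is on your PATH and that it is run against the engine directory; rerun_workflow and its parameters are hypothetical names, not part of PIO. Note that pio deploy typically stays in the foreground while the server is up, so the last call blocks.

```python
import subprocess

WORKFLOW = ("build", "train", "deploy")  # the order given above


def rerun_workflow(engine_dir=".", run=subprocess.run):
    """Run `pio build`, `pio train`, `pio deploy` in order.

    Stops at the first failing step: with check=True, subprocess.run
    raises CalledProcessError on a non-zero exit code.
    """
    for step in WORKFLOW:
        run(["pio", step], cwd=engine_dir, check=True)
```

The run parameter only exists so the function can be exercised without PIO installed; in normal use, rerun_workflow("/path/to/engine") is enough.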



On Jan 17, 2017, at 7:16 AM, Ashutosh Banerjee <[email protected]> wrote:

Hi,

We are facing an intermittent issue while using the Universal Recommender
template on our production servers. Every once in a while we get an error that
none of the Elasticsearch nodes are reachable:
(org.elasticsearch.client.transport.NoNodeAvailableException: None of the 
configured nodes are available: [])

Our Elasticsearch cluster is hosted by an Elasticsearch hosting provider (Qbox),
and at the time of the error all the shards were green and reachable from the
server where the PredictionIO template is hosted. There was no shortage of RAM
or disk space on that server when the issue occurred.
The issue gets resolved by performing the following steps:
1) pio undeploy 
2) pio-stop-all
3) pio-start-all
4) pio deploy

The PIO version used is 0.10.0-incubating and the Universal Recommender version
is 0.5.0. We faced this issue earlier as well and could not figure out why the
ES nodes became unreachable after a particular period.
The pio-env config for ES is below:

PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=<cluster-name>
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=<hosted-es-domain>
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=<port>  # transport-layer port used here
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=<local-es-path>

We are currently using Elasticsearch version 1.5.2 on our hosted servers, as
version 1.4.0, which PIO uses, is not supported by the hosted service. We are
using the transport-layer port to communicate with the ES nodes as opposed to
calling the HTTPS endpoint.

Please help us figure out the underlying cause for this issue.

Thanks,
Ashutosh


