Hi Guys,

I am trying to set up a PredictionIO cluster working against existing Elasticsearch and Hadoop clusters.

Each service works fine on its own (HBase, Elasticsearch, Hadoop/Spark), but as soon as I run "pio-start-all" I get these errors:

aml@master:~$ pio status
[INFO] [Management$] Inspecting PredictionIO...
[INFO] [Management$] PredictionIO 0.12.0-incubating is installed at /home/aml
[INFO] [Management$] Inspecting Apache Spark...
[INFO] [Management$] Apache Spark is installed at /opt/spark/spark-1.6.3-bin-hadoop2.6
[INFO] [Management$] Apache Spark 1.6.3 detected (meets minimum requirement of 1.3.0)
[INFO] [Management$] Inspecting storage backend connections...
[INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)...
[INFO] [Storage$] Verifying Model Data Backend (Source: HDFS)...
[ERROR] [Storage$] Error initializing storage client for source HDFS.
java.lang.IllegalArgumentException: Wrong FS: hdfs://master.c.choose-ninja-01.internal:9000/models, expected: file:///
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:649)
        at org.apache.hadoop.fs.RawLocalFileSystem.setWorkingDirectory(RawLocalFileSystem.java:547)
        at org.apache.hadoop.fs.FilterFileSystem.setWorkingDirectory(FilterFileSystem.java:280)
        at org.apache.predictionio.data.storage.hdfs.StorageClient.<init>(StorageClient.scala:33)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.predictionio.data.storage.Storage$.getClient(Storage.scala:252)
        at org.apache.predictionio.data.storage.Storage$.org$apache$predictionio$data$storage$Storage$$updateS2CM(Storage.scala:283)
        at org.apache.predictionio.data.storage.Storage$$anonfun$sourcesToClientMeta$1.apply(Storage.scala:244)
        at org.apache.predictionio.data.storage.Storage$$anonfun$sourcesToClientMeta$1.apply(Storage.scala:244)
        at scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:194)
        at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:80)
        at org.apache.predictionio.data.storage.Storage$.sourcesToClientMeta(Storage.scala:244)
        at org.apache.predictionio.data.storage.Storage$.getDataObject(Storage.scala:315)
        at org.apache.predictionio.data.storage.Storage$.getDataObjectFromRepo(Storage.scala:300)
        at org.apache.predictionio.data.storage.Storage$.getModelDataModels(Storage.scala:442)
        at org.apache.predictionio.data.storage.Storage$.verifyAllDataObjects(Storage.scala:381)
        at org.apache.predictionio.tools.commands.Management$.status(Management.scala:156)
        at org.apache.predictionio.tools.console.Pio$.status(Pio.scala:155)
        at org.apache.predictionio.tools.console.Console$$anonfun$main$1.apply(Console.scala:721)
        at org.apache.predictionio.tools.console.Console$$anonfun$main$1.apply(Console.scala:656)
        at scala.Option.map(Option.scala:146)
        at org.apache.predictionio.tools.console.Console$.main(Console.scala:656)
        at org.apache.predictionio.tools.console.Console.main(Console.scala)

[ERROR] [Management$] Unable to connect to all storage backends successfully.
The following shows the error message from the storage backend.
Data source HDFS was not properly initialized. 
(org.apache.predictionio.data.storage.StorageClientException)
Dumping configuration of initialized storage backend sources.
Please make sure they are correct.
Source Name: ELASTICSEARCH; Type: elasticsearch; Configuration: TYPE -> elasticsearch, HOME -> /opt/elasticsearch/elasticsearch-5.5.2
Source Name: HDFS; Type: (error); Configuration: (error
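For reference, the storage section of my conf/pio-env.sh looks roughly like this (the HDFS path is the one from the error above, the Elasticsearch values match the configuration dump, and the variable names follow the 0.12.0 pio-env.sh template; repository names are the template defaults, not necessarily exactly what I have):

```shell
# Model data stored on the Hadoop cluster
PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=HDFS

PIO_STORAGE_SOURCES_HDFS_TYPE=hdfs
PIO_STORAGE_SOURCES_HDFS_PATH=hdfs://master.c.choose-ninja-01.internal:9000/models

# Metadata stored in Elasticsearch
PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH

PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=/opt/elasticsearch/elasticsearch-5.5.2
```

Since the exception says "expected: file:///", my guess is that the HDFS StorageClient is falling back to the local filesystem defaults instead of reading my core-site.xml (i.e. HADOOP_CONF_DIR is perhaps not visible to pio), but I am not sure.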


And then Hadoop itself is not working anymore:

hdfs dfs -mkdir test
mkdir: Call From master.c.choose-ninja-01.internal/10.128.0.4 to master.c.choose-ninja-01.internal:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
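To rule out the obvious, this is roughly how I check the NameNode before and after running pio (standard Hadoop CLI commands; the expected value is just what my config should give):

```shell
# Is the NameNode JVM still running?
jps | grep -i namenode

# Which default filesystem does the Hadoop client resolve?
# Should print hdfs://master.c.choose-ninja-01.internal:9000
hdfs getconf -confKey fs.defaultFS

# Basic HDFS round-trip
hdfs dfs -ls /
```

After pio-start-all, jps no longer shows the NameNode, which matches the "Connection refused" above.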

I have tried both Google Cloud Dataproc and following your tutorial http://actionml.com/docs/small_ha_cluster on plain VM instances (still on Google Cloud), but the results are the same. Any idea? We have tried so many times that I am starting to think PredictionIO cannot be used with Hadoop and Elasticsearch clusters. :/

Thanks for your help,
Regards

Thibaut




Thibaut Gensollen
