Yes, those instructions tell you to run HDFS in pseudo-cluster mode. What
do you see in the HDFS GUI at localhost:50070?

Those setup instructions create pseudo-clustered Spark, HDFS, and HBase.
Everything runs on a single machine but, as the page says, is configured so
you can easily expand to a full cluster by pointing the config at remote
HDFS or Spark clusters.

One fix, if you don’t want to run those services in pseudo-cluster mode, is:

1) Remove any mention of PGSQL or JDBC; we are not using them. Those
settings do not appear on the page you linked to and are unused.
2) On a single machine you can put the dummy/empty model file in LOCALFS,
so change the lines
    PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=HDFS
    PIO_STORAGE_SOURCES_HDFS_PATH=hdfs://localhost:9000/models
to
    PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS
    PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
    PIO_STORAGE_SOURCES_LOCALFS_PATH=/path/to/models
substituting a directory where you want to save models. Note that the
source name (LOCALFS) must match the PIO_STORAGE_SOURCES_LOCALFS_* keys.
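If it helps, the substitution above can be scripted. This is a minimal
sketch, assuming pio-env.sh has one variable per line; the function name
and the model directory argument are mine, not part of PIO:

```python
# Sketch: switch PredictionIO model storage from HDFS to the local
# filesystem by rewriting the relevant pio-env.sh lines. The model_dir
# argument is whatever local path you choose for saving models.

def use_localfs(env_text: str, model_dir: str) -> str:
    """Return pio-env.sh text with model storage pointed at LOCALFS."""
    out = []
    for line in env_text.splitlines():
        if line.startswith("PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE="):
            out.append("PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS")
        elif line.startswith("PIO_STORAGE_SOURCES_HDFS_PATH="):
            # Replace the HDFS source with a LOCALFS source definition.
            out.append("PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs")
            out.append("PIO_STORAGE_SOURCES_LOCALFS_PATH=" + model_dir)
        else:
            out.append(line)
    return "\n".join(out)
```

All other lines pass through unchanged, so the rest of your pio-env.sh is
left alone.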

Running them in pseudo-cluster mode gives you GUIs to watch job progress
and browse HDFS files, among other things. We recommend it because those
GUIs help you debug problems once you get to large amounts of data and
begin running out of resources.
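Since PIO resolves each PIO_STORAGE_REPOSITORIES_*_SOURCE value by looking
up the matching PIO_STORAGE_SOURCES_&lt;NAME&gt;_* keys, a quick cross-check of
pio-env.sh can catch mismatched names. A minimal sketch (the parsing is
deliberately naive: it ignores shell variable expansion and quoting):

```python
# Sketch: verify that every storage repository's SOURCE name in pio-env.sh
# has a matching PIO_STORAGE_SOURCES_<NAME>_TYPE entry. Naive line-based
# parsing; comments and shell expansion are ignored.

def check_storage_sources(env_text: str) -> list:
    """Return repository source names that lack a _TYPE definition."""
    kv = {}
    for line in env_text.splitlines():
        line = line.strip()
        if "=" in line and not line.startswith("#"):
            key, _, value = line.partition("=")
            kv[key.strip()] = value.strip()
    missing = []
    for key, value in kv.items():
        if key.startswith("PIO_STORAGE_REPOSITORIES_") and key.endswith("_SOURCE"):
            if "PIO_STORAGE_SOURCES_" + value + "_TYPE" not in kv:
                missing.append(value)
    return missing
```

Your pio-env below passes this check as written; it would catch, for
example, switching MODELDATA_SOURCE to LOCALFS without also adding the
LOCALFS source keys.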


From: Anuj Kumar <anuj.ku...@timesinternet.in>
Date: June 19, 2018 at 10:35:02 AM
To: p...@occamsmachete.com
Cc: user@predictionio.apache.org, actionml-u...@googlegroups.com
Subject:  Re: java.util.NoSuchElementException: head of empty list when
running train

Hi Pat,
          I read it at the link below:

http://actionml.com/docs/single_machine

here is the pio-env.sh

SPARK_HOME=$PIO_HOME/vendors/spark-2.1.1-bin-hadoop2.6
POSTGRES_JDBC_DRIVER=$PIO_HOME/lib/postgresql-42.0.0.jar
MYSQL_JDBC_DRIVER=$PIO_HOME/lib/mysql-connector-java-5.1.41.jar
HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
HBASE_CONF_DIR=/usr/local/hbase/conf
PIO_FS_BASEDIR=$HOME/.pio_store
PIO_FS_ENGINESDIR=$PIO_FS_BASEDIR/engines
PIO_FS_TMPDIR=$PIO_FS_BASEDIR/tmp
PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH
PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE
PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=HDFS
PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc
PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio
PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio
PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=/usr/local/els
PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=pio
PIO_STORAGE_SOURCES_HDFS_TYPE=hdfs
PIO_STORAGE_SOURCES_HDFS_PATH=hdfs://localhost:9000/models
PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
PIO_STORAGE_SOURCES_HBASE_HOME=/usr/local/hbase

Thanks,
Anuj Kumar



On Tue, Jun 19, 2018 at 9:16 PM Pat Ferrel <p...@occamsmachete.com> wrote:

> Can you show me where on the AML site it says to store models in HDFS? It
> should not say that. I think that may be from the PIO site, so you should
> ignore it.
>
> Can you share your pio-env? You need to go through the whole workflow,
> from pio build and pio train to pio deploy, using a template from the same
> directory with the same engine.json and pio-env, and I suspect something
> is wrong in pio-env.
>
>
> From: Anuj Kumar <anuj.ku...@timesinternet.in>
> Date: June 19, 2018 at 1:28:11 AM
> To: p...@occamsmachete.com
> Cc: user@predictionio.apache.org, actionml-u...@googlegroups.com
> Subject:  Re: java.util.NoSuchElementException: head of empty list when
> running train
>
> Tried with the basic engine.json from the UR site examples. It seems to
> work but gets stuck at "pio deploy", which throws the following error:
>
> [ERROR] [OneForOneStrategy] Failed to invert: [B@35c7052
>
>
> Before that, "pio train" completed but logged the following error. I
> suspect this is why "pio deploy" is not working. Please help.
>
> [ERROR] [HDFSModels] File /models/pio_modelAWQXIr4APcDlNQi8DwVj could only
> be replicated to 0 nodes instead of minReplication (=1).  There are 0
> datanode(s) running and no node(s) are excluded in this operation.
>
> at
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1726)
>
> at
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:265)
>
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2565)
>
> at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:829)
>
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:510)
>
> at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447)
>
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
>
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:850)
>
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:793)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at javax.security.auth.Subject.doAs(Subject.java:422)
>
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1840)
>
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2489)
>
>
> On Tue, Jun 19, 2018 at 10:45 AM Anuj Kumar <anuj.ku...@timesinternet.in>
> wrote:
>
>> Sure, here it is.
>>
>> {
>>   "comment":" This config file uses default settings for all but the required values see README.md for docs",
>>   "id": "default",
>>   "description": "Default settings",
>>   "engineFactory": "com.actionml.RecommendationEngine",
>>   "datasource": {
>>     "params" : {
>>       "name": "sample-handmad",
>>       "appName": "np",
>>       "eventNames": ["read", "search", "view", "category-pref"],
>>       "minEventsPerUser": 1,
>>       "eventWindow": {
>>         "duration": "300 days",
>>         "removeDuplicates": true,
>>         "compressProperties": true
>>       }
>>     }
>>   },
>>   "sparkConf": {
>>     "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
>>     "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
>>     "spark.kryo.referenceTracking": "false",
>>     "spark.kryoserializer.buffer": "300m",
>>     "spark.executor.memory": "4g",
>>     "spark.executor.cores": "2",
>>     "spark.task.cpus": "2",
>>     "spark.default.parallelism": "16",
>>     "es.index.auto.create": "true"
>>   },
>>   "algorithms": [
>>     {
>>       "comment": "simplest setup where all values are default, popularity based backfill, must add eventsNames",
>>       "name": "ur",
>>       "params": {
>>         "appName": "np",
>>         "indexName": "np",
>>         "typeName": "items",
>>         "blacklistEvents": [],
>>         "comment": "must have data for the first event or the model will not build, other events are optional",
>>         "indicators": [
>>           { "name": "read" },
>>           { "name": "search", "maxCorrelatorsPerItem": 5 },
>>           { "name": "category-pref", "maxCorrelatorsPerItem": 50 },
>>           { "name": "view", "maxCorrelatorsPerItem": 50 }
>>         ],
>>         "expireDateName": "itemExpiry",
>>         "dateName": "date",
>>         "num": 5
>>       }
>>     }
>>   ]
>> }
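The "comment" fields in this engine.json describe constraints that can be
checked mechanically. A minimal sketch (the function name and structure
are mine, not part of the UR; it only checks consistency between
eventNames and indicators, not whether the event store actually holds
data for each event):

```python
# Sketch: sanity-check a UR engine.json against the constraints its
# comments describe: eventNames must be non-empty, and every indicator
# name should also appear in datasource eventNames. Field paths follow
# the engine.json above.

import json

def check_engine(engine_text: str) -> list:
    """Return a list of human-readable problems found in the config."""
    cfg = json.loads(engine_text)
    problems = []
    event_names = cfg["datasource"]["params"].get("eventNames", [])
    if not event_names:
        problems.append("datasource eventNames is empty")
    for algo in cfg.get("algorithms", []):
        for ind in algo["params"].get("indicators", []):
            if ind["name"] not in event_names:
                problems.append(
                    "indicator %r not in eventNames" % ind["name"])
    return problems
```

The file above passes this check, so the "head of empty list" during
train usually means one of the listed events has no data in the event
store for the app.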
>>
>>
>> On Mon, Jun 18, 2018 at 8:55 PM Pat Ferrel <p...@occamsmachete.com> wrote:
>>
>>> This sounds like some missing required config in engine.json. Can you
>>> share the file?
>>>
>>>
>>> From: Anuj Kumar <anuj.ku...@timesinternet.in>
>>> Reply: user@predictionio.apache.org
>>> Date: June 18, 2018 at 5:05:22 AM
>>> To: user@predictionio.apache.org
>>> Subject:  java.util.NoSuchElementException: head of empty list when
>>> running train
>>>
>>> Getting this error while running "pio train". Please help:
>>>
>>> Exception in thread "main" java.util.NoSuchElementException: head of
>>> empty list
>>>
>>> at scala.collection.immutable.Nil$.head(List.scala:420)
>>>
>>> at scala.collection.immutable.Nil$.head(List.scala:417)
>>>
>>> at
>>> org.apache.mahout.math.cf.SimilarityAnalysis$.crossOccurrenceDownsampled(SimilarityAnalysis.scala:177)
>>>
>>> at com.actionml.URAlgorithm.calcAll(URAlgorithm.scala:343)
>>>
>>> at com.actionml.URAlgorithm.train(URAlgorithm.scala:295)
>>>
>>> at com.actionml.URAlgorithm.train(URAlgorithm.scala:180)
>>>
>>> at
>>> org.apache.predictionio.controller.P2LAlgorithm.trainBase(P2LAlgorithm.scala:49)
>>>
>>> at
>>> org.apache.predictionio.controller.Engine$$anonfun$18.apply(Engine.scala:690)
>>>
>>> at
>>> org.apache.predictionio.controller.Engine$$anonfun$18.apply(Engine.scala:690)
>>>
>>> at
>>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>>>
>>> at
>>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>>>
>>> at scala.collection.immutable.List.foreach(List.scala:381)
>>>
>>> at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>>>
>>> at scala.collection.immutable.List.map(List.scala:285)
>>>
>>> at org.apache.predictionio.controller.Engine$.train(Engine.scala:690)
>>>
>>> at org.apache.predictionio.controller.Engine.train(Engine.scala:176)
>>>
>>> at
>>> org.apache.predictionio.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:67)
>>>
>>> at
>>> org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:251)
>>>
>>> at
>>> org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
>>>
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>
>>> at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>
>>> at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>
>>> at java.lang.reflect.Method.invoke(Method.java:498)
>>>
>>> at
>>> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
>>>
>>> at
>>> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
>>>
>>> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
>>>
>>> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
>>>
>>> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>
>>>
>>> --
>>> -
>>> Best,
>>> Anuj Kumar
>>>
>>>
>>
>> --
>> -
>> Best,
>> Anuj Kumar
>>
>
>
> --
> -
> Best,
> Anuj Kumar
>
>

--
-
Best,
Anuj Kumar
