Yes.

There is very little validation of events done by PredictionIO, since events are
template-specific and the EventServer is not. Any usage event that does not have
"entityType": "user" is ignored, so to the UR you have no data.

Also, user properties encoded in this way are ignored. User data should be
encoded as some kind of preference indicator and sent as a named usage event.
Location can be used as a preference indicator, but I'd pick one level of
granularity, like postal code, and it needs special setup, since the downsampling
thresholds assume you will have many possible values for any indicator. This
can be configured, but you have to ask yourself: "Do I really think user
location is going to be important?"
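
To make that concrete, here is a hypothetical sketch of location sent as a
named usage event. The event name "location-pref" and the ids are invented for
illustration; a secondary event like this would also have to be added to
eventNames in engine.json, after the primary event:

import predictionio

client = predictionio.EventClient(
    access_key="YOUR_ACCESS_KEY",   # placeholder
    url="http://localhost:7070"
)

# Hypothetical secondary indicator: the target id is the postal code itself,
# one coarse value per user instead of free-form location properties.
client.create_event(
    event="location-pref",          # invented name; add to eventNames too
    entity_type="user",
    entity_id="3927685",
    target_entity_type="item",
    target_entity_id="45202"        # postal code as the indicator value
)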


On Mar 28, 2017, at 11:05 AM, Haddix, Steven <[email protected]> wrote:

SOLVED: It appears the Universal Recommendation Engine requires the event
entityType to equal "user".

From: Microsoft Office User <[email protected]>
Reply-To: [email protected]
Date: Tue, 28 Mar 2017 16:38:45 +0000
To: "[email protected]" <[email protected]>
Subject: pio train error java.lang.NegativeArraySizeException

I know this has been posted a few times, but I can't seem to get around this
error. I'm not sure what I'm missing, but any help is appreciated.

I'm using the Universal Recommendation Engine.

It appears I have data loaded with an event type that matches the engine.json.

When training my data I get the following error:

Error

[WARN] [TaskSetManager] Lost task 0.0 in stage 13.0 (TID 9, localhost): java.lang.NegativeArraySizeException
        at org.apache.mahout.math.DenseVector.<init>(DenseVector.java:57)
        at org.apache.mahout.sparkbindings.SparkEngine$$anonfun$5.apply(SparkEngine.scala:78)
        at org.apache.mahout.sparkbindings.SparkEngine$$anonfun$5.apply(SparkEngine.scala:77)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

My engine.json is:
{
  "comment": "This config file uses default settings for all but the required values, see README.md for docs",
  "id": "default",
  "description": "Default settings",
  "engineFactory": "com.wendys.RecommendationEngine",
  "datasource": {
    "params" : {
      "name": "sample-handmade-data.txt",
      "appName": "Ordering",
      "eventNames": ["buy"]
    }
  },
  "sparkConf": {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
    "spark.kryo.referenceTracking": "false",
    "spark.kryoserializer.buffer": "300m",
    "spark.executor.memory": "4g",
    "es.index.auto.create": "true"
  },
  "algorithms": [
    {
      "comment": "simplest setup where all values are default, popularity based backfill, must add eventNames",
      "name": "ur",
      "params": {
        "appName": "Ordering",
        "indexName": "urindex",
        "typeName": "items",
        "comment": "must have data for the first event or the model will not build, other events are optional",
        "eventNames": ["buy"]
      }
    }
  ]
}

Sample event from "pio export --appid 3 --format json"
{
  "eventId": "__RjfodLTziIR9c6tFXyCwAAAVecf1DgqYK6F_BBYGQ",
  "event": "buy",
  "entityType": "customer",
  "entityId": "3927685",
  "targetEntityType": "item",
  "targetEntityId": "1",
  "properties": {
    "city": [
      "<removed>"
    ],
    "state": [
      "<removed>"
    ],
    "zip": [
      <removed>
    ],
    "country": [
      "US"
    ]
  },
  "eventTime": "2016-10-07T00:16:12.000Z",
  "creationTime": "2017-03-28T14:20:14.177Z"
}
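
If it helps, a rough sketch of repairing the exported data rather than
re-collecting it: read the "pio export" JSON lines, rewrite the entityType,
and re-send through the EventServer. The file name and credentials are
placeholders:

import json
import predictionio

client = predictionio.EventClient(
    access_key="YOUR_ACCESS_KEY",   # placeholder
    url="http://localhost:7070"
)

# "pio export" with --format json writes one JSON event per line.
with open("exported_events.json") as f:
    for line in f:
        e = json.loads(line)
        client.create_event(
            event=e["event"],
            entity_type="user",     # was "customer" in the export
            entity_id=e["entityId"],
            target_entity_type=e["targetEntityType"],
            target_entity_id=e["targetEntityId"]
            # the original eventTime could be preserved via the event_time
            # kwarg, which expects a timezone-aware datetime
        )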

Pio status result

[INFO] [Console$] Inspecting PredictionIO…
[INFO] [Console$] PredictionIO 0.9.6 is installed at /PredictionIO
[INFO] [Console$] Inspecting Apache Spark…
[INFO] [Console$] Apache Spark is installed at /PredictionIO/vendors/spark-1.6.2-bin-hadoop2.6
[INFO] [Console$] Apache Spark 1.6.2 detected (meets minimum requirement of 1.3.0)
[INFO] [Console$] Inspecting storage backend connections…
[INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)…
[INFO] [Storage$] Verifying Model Data Backend (Source: LOCALFS)…
[INFO] [Storage$] Verifying Event Data Backend (Source: HBASE)…
[INFO] [Storage$] Test writing to Event Store (App Id 0)…
[INFO] [HBLEvents] The table pio_event:events_0 doesn't exist yet. Creating now…
[INFO] [HBLEvents] Removing table pio_event:events_0…
[INFO] [Console$] (sleeping 5 seconds for all messages to show up…)
[INFO] [Console$] Your system is all ready to go.


