Cross-validation for item-based recommenders is problematic and of dubious 
value. I'd A/B test changes, starting from the defaults and going from there. 


On May 1, 2017, at 8:34 AM, Dennis Honders <[email protected]> wrote:

Hi,

I'm currently working on an Evaluator for the Similar product template. I'm not 
a Scala expert. 

I followed the Classification Quickstart, which is used for the Evaluator 
tutorial: http://predictionio.incubator.apache.org/evaluation/paramtuning/

There, an RDD of LabeledPoint is used to retrieve the data: 
val labeledPoints: RDD[LabeledPoint] = eventsDb.aggregateProperties(...
The Similar Product template retrieves its data like this: 
val usersRDD: RDD[(String, User)] = PEventStore.aggregateProperties(...
val itemsRDD: RDD[(String, Item)] = PEventStore.aggregateProperties(
val viewEventsRDD: RDD[ViewEvent] = PEventStore.find(
According to the docs, the two approaches should be equivalent?

}.cache()
    // End of reading from data store

    // K-fold splitting
    val evalK = dsp.evalK.get
    val indexedPointsUsers: RDD[((String, User), Long)] = usersRDD.zipWithIndex()
    val indexedPointsItems: RDD[((String, Item), Long)] = itemsRDD.zipWithIndex()
    val indexedPointsView: RDD[(ViewEvent, Long)] = viewEventsRDD.zipWithIndex()

    (0 until evalK).map { idx =>
      val trainingPointsUsers = indexedPointsUsers.filter(_._2 % evalK != idx).map(_._1)
      val testingPointsUsers = indexedPointsUsers.filter(_._2 % evalK == idx).map(_._1)

      val trainingPointsItems = indexedPointsItems.filter(_._2 % evalK != idx).map(_._1)
      val testingPointsItems = indexedPointsItems.filter(_._2 % evalK == idx).map(_._1)

      val trainingPointsView = indexedPointsView.filter(_._2 % evalK != idx).map(_._1)
      val testingPointsView = indexedPointsView.filter(_._2 % evalK == idx).map(_._1)

      (
        new TrainingData(trainingPointsUsers, trainingPointsItems, trainingPointsView),
        new EmptyEvaluationInfo(),
        testingPointsUsers.map {
          p => (new Query(p.features(0), p.features(1), p.features(2)), new ActualResult(p.label))
        }
      )
    }
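
For reference, the index-modulo split used above can be sketched on plain Scala collections, without Spark. This is only an illustration of the pattern: the object name KFoldSketch and the helper kFold are hypothetical and are not part of PredictionIO or the template.

```scala
object KFoldSketch {
  // Split `data` into k (training, testing) pairs by index modulo k,
  // mirroring the zipWithIndex + filter pattern applied to the RDDs above.
  // Hypothetical helper for illustration only.
  def kFold[A](data: Seq[A], k: Int): Seq[(Seq[A], Seq[A])] = {
    require(k > 0, "k must be positive")
    val indexed = data.zipWithIndex
    (0 until k).map { idx =>
      // Elements whose index falls in fold `idx` are held out for testing.
      val training = indexed.filter(_._2 % k != idx).map(_._1)
      val testing  = indexed.filter(_._2 % k == idx).map(_._1)
      (training, testing)
    }
  }
}
```

Each element lands in the testing set of exactly one fold and in the training set of the other k - 1 folds.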

class TrainingData(val users: RDD[(String, User)], val items: RDD[(String, Item)], val viewEvents: RDD[ViewEvent]) extends Serializable {
  override def toString = {
      s"users: [${users.count()} (${users.take(2).toList}...)]" +
      s"items: [${items.count()} (${items.take(2).toList}...)]" +
      s"viewEvents: [${viewEvents.count()}] (${viewEvents.take(2).toList}...)"
  }
}

What happens in this part? The red marks correspond to 'Cannot resolve 
symbol *':
 testingPointsUsers.map {
          p => (new Query(p.features(0), p.features(1), p.features(2)), new ActualResult(p.label))
        }
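
One likely reading of the error, sketched with stand-in types: in the classification tutorial each element is a LabeledPoint, which has `features` and `label` members, whereas here each element of testingPointsUsers is a (String, User) tuple, which has neither. The case classes below are hypothetical placeholders, not the template's real definitions.

```scala
object FieldAccessSketch {
  // Hypothetical stand-ins for illustration; not the real template classes.
  case class User()
  case class Labeled(label: Double, features: Vector[Double])

  val labeled: Seq[Labeled] = Seq(Labeled(1.0, Vector(0.1, 0.2, 0.3)))
  val users: Seq[(String, User)] = Seq(("u1", User()))

  // Compiles: Labeled exposes `features` and `label`.
  val fromLabeled: Seq[(Vector[Double], Double)] =
    labeled.map(p => (p.features, p.label))

  // A tuple only exposes _1 and _2 (or can be pattern matched).
  val userIds: Seq[String] = users.map { case (id, _) => id }

  // users.map(p => p.label) // would not compile: (String, User) has no `label`
}
```

Under this reading, the Query and ActualResult for the evaluation would have to be built from fields that the tuple elements and the template's own classes actually provide.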



Thanks in advance, 

Dennis
