[GitHub] spark pull request: [MLLIB] SPARK-4231, SPARK-3066: Add RankingMet...

debasish83 Tue, 31 Mar 2015 15:29:39 -0700

Github user debasish83 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3098#discussion_r27529681
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala ---
    @@ -103,13 +109,106 @@ class MatrixFactorizationModel private[mllib] (
         recommend(productFeatures.lookup(product).head, userFeatures, num)
           .map(t => Rating(t._1, product, t._2))
     
    +  /**
    +   * Recommends topK users/products.
    +   *
    +   * @param num how many users to return. The number returned may be less than this.
    +   * @return [Array[Rating]] objects, each of which contains a userID, the given productID and a
    +   *  "score" in the rating field. Each represents one recommended user, and they are sorted
    +   *  by score, decreasing. The first returned is the one predicted to be most strongly
    +   *  recommended to the product. The score is an opaque value that indicates how strongly
    +   *  recommended the user is.
    +   */
    +
    +  /**
    +   * Recommend topK products for all users
    +   */
    +  def recommendProductsForUsers(num: Int): RDD[(Int, Array[Rating])] = {
    +    val topK = userFeatures.map { x => (x._1, num) }
    +    recommendProductsForUsers(topK)
    +  }
    +
    +  /**
    +   * Recommend topK users for all products
    +   */
    +  def recommendUsersForProducts(num: Int): RDD[(Int, Array[Rating])] = {
    +    val topK = productFeatures.map { x => (x._1, num) }
    +    recommendUsersForProducts(topK)
    +  }
    +
    +  val ord = Ordering.by[Rating, Double](x => x.rating)
    +  case class FeatureTopK(feature: Vector, topK: Int)
    +
    +  /**
    +   * Recommend topK products for users in userTopK RDD
    +   */
    +  def recommendProductsForUsers(
    +    userTopK: RDD[(Int, Int)]): RDD[(Int, Array[Rating])] = {
    +    val userFeaturesTopK = userFeatures.join(userTopK).map {
    +      case (userId, (userFeature, topK)) =>
    +        (userId, FeatureTopK(Vectors.dense(userFeature), topK))
    +    }
    +    val productVectors = productFeatures.map {
    +      x => (x._1, Vectors.dense(x._2))
    +    }.collect
    +
    +    userFeaturesTopK.map {
    +      case (userId, userFeatureTopK) => {
    +        val predictions = productVectors.map {
    +          case (productId, productVector) =>
    +            Rating(userId, productId,
    +              BLAS.dot(userFeatureTopK.feature, productVector))
    --- End diff --
    
    I will bring in a lot of level 3 BLAS in the next PR. I am writing the dgemv and dgemm versions for several of these APIs. For now I will add a TODO.
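
    To illustrate the idea behind that TODO, here is a minimal sketch, not the code from this PR, of how the per-product BLAS.dot calls in the diff could be replaced by a single matrix-vector product. It assumes the product factors are packed into one row-major array; the names GemvSketch, pack, gemv, scoresByDot and topK are hypothetical, and the hand-written loop only stands in for what a native dgemv call would compute.

```scala
object GemvSketch {
  // Baseline: one dot product per product vector, as in the quoted diff.
  def scoresByDot(user: Array[Double], products: Array[Array[Double]]): Array[Double] =
    products.map { p =>
      var s = 0.0
      var i = 0
      while (i < user.length) { s += user(i) * p(i); i += 1 }
      s
    }

  // Pack all product factors into one row-major array (numProducts x rank).
  // Assumption: every product vector has the same length (the rank).
  def pack(products: Array[Array[Double]]): Array[Double] = products.flatten

  // y = A * x for a row-major A (numRows x rank). This is the operation a
  // dgemv call performs; it is written by hand here so the sketch has no
  // dependencies.
  def gemv(a: Array[Double], numRows: Int, rank: Int, x: Array[Double]): Array[Double] = {
    require(a.length == numRows * rank && x.length == rank)
    val y = new Array[Double](numRows)
    var r = 0
    while (r < numRows) {
      val off = r * rank
      var s = 0.0
      var i = 0
      while (i < rank) { s += a(off + i) * x(i); i += 1 }
      y(r) = s
      r += 1
    }
    y
  }

  // Pair each score with its product id and keep the k highest scores.
  def topK(scores: Array[Double], ids: Array[Int], k: Int): Array[(Int, Double)] =
    ids.zip(scores).sortBy(-_._2).take(k)
}
```

    Scoring a block of users against the same packed matrix would turn this into a matrix-matrix product, which is where dgemm would come in.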



