Github user junyangq commented on a diff in the pull request:
https://github.com/apache/spark/pull/14384#discussion_r75004906
--- Diff: R/pkg/inst/tests/testthat/test_mllib.R ---
@@ -454,4 +454,61 @@ test_that("spark.survreg", {
}
})
+test_that("spark.als", {
+ # R code to reproduce the result.
+ #
+ #' data <- list(list(0, 0, 4.0), list(0, 1, 2.0), list(1, 1, 3.0), list(1, 2, 4.0),
+ #' list(2, 1, 1.0), list(2, 2, 5.0))
+ #' df <- createDataFrame(data, c("user", "item", "rating"))
+ #' model <- spark.als(df, ratingCol = "rating", userCol = "user", itemCol = "item",
+ #' rank = 10, maxIter = 5, seed = 0)
+ #' test <- createDataFrame(list(list(0, 2), list(1, 0), list(2, 0)), c("user", "item"))
+ #' predict(model, test)
+ #' predict(model, test)
+ #
+ # -- output of 'predict(model, data)'
+ #
+ # user item prediction
--- End diff --
I think the usage shown in this example is mostly covered by the existing
examples. Did you have anything specific in mind?
The algorithm does not guarantee nonnegative predictions unless nonnegativity
is requested through the arguments. The short answer is that, if all ratings in
the training data are nonnegative, a negative prediction can be read as a low
predicted rating. In fact, with no constraints imposed, a predicted rating can
be any real number. An alternative is to apply another function that maps the
value back into the desired range (e.g. 0-5).
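As a rough illustration of that last option, here is a minimal sketch in base R
that clamps predictions into a fixed range. The helper name clamp_rating, the
0-5 bounds, and the sample values are all made up for illustration; they do not
come from the PR or from a fitted model.

```r
# Hypothetical helper (not part of SparkR): clamp predicted ratings
# into the closed interval [lower, upper].
clamp_rating <- function(pred, lower = 0, upper = 5) {
  # pmax raises values below `lower`; pmin caps values above `upper`.
  pmin(pmax(pred, lower), upper)
}

# Made-up prediction values, for illustration only.
raw <- c(-0.7, 1.3, 4.2, 5.9)
clamped <- clamp_rating(raw)
print(clamped)
```

Clamping is only one possible mapping. It leaves in-range predictions untouched,
whereas a smooth rescaling would change every value. With SparkR, one would
presumably apply such a helper to the prediction column after collecting the
result of predict() into a local data.frame.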