viirya commented on a change in pull request #25789: [SPARK-28927][ML] Show 
warning when input data to ALS is indeterminate
URL: https://github.com/apache/spark/pull/25789#discussion_r324478452
 
 

 ##########
 File path: R/pkg/R/mllib_recommendation.R
 ##########
 @@ -82,6 +82,10 @@ setClass("ALSModel", representation(jobj = "jobj"))
 #' statsS <- summary(modelS)
 #' }
 #' @note spark.als since 2.1.0
+#' @note the input rating dataframe to the ALS implementation should not be indeterminate.
 
 Review comment:
   I think checkpointing is relatively reliable. If a checkpoint is lost, the 
Spark job fails rather than recomputing the data, so you should not get 
inconsistent data once you checkpoint.
   
   We have two ways to fix it: one is to checkpoint, the other is to sort the 
data before sample/randomSplit. I added both to the updated note.
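   
   To illustrate why sorting helps, here is a toy sketch in plain Python. It 
is not Spark code and does not reflect Spark's actual implementation; the 
`random_split` helper and the ratings data are hypothetical. It assumes a 
seeded split that draws one random number per row in iteration order, so the 
result depends on row order unless the rows are sorted first.

```python
import random


def random_split(rows, weight, seed):
    """Toy split (hypothetical helper): one seeded draw per row, in iteration order."""
    rng = random.Random(seed)
    train, test = [], []
    for row in rows:
        (train if rng.random() < weight else test).append(row)
    return train, test


# Hypothetical ratings: (user, item, rating).
ratings = [(u, i, float((u * i) % 5)) for u in range(20) for i in range(5)]

# Simulate a recomputation that yields the same rows in a different order.
first_run = list(ratings)
second_run = list(reversed(ratings))

# Without sorting, each row may receive a different draw on each run,
# so the two training sets will very likely differ.
unsorted_a, _ = random_split(first_run, 0.8, seed=42)
unsorted_b, _ = random_split(second_run, 0.8, seed=42)
splits_match_unsorted = set(unsorted_a) == set(unsorted_b)

# Sorting first fixes the row order, so the same seed gives the same split.
sorted_a, _ = random_split(sorted(first_run), 0.8, seed=42)
sorted_b, _ = random_split(sorted(second_run), 0.8, seed=42)
splits_match_sorted = sorted_a == sorted_b
```

   In this model, checkpointing corresponds to saving `first_run` once and 
never recomputing it, which avoids the reordering altogether.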
   
   Targeting the specific problem here sounds better. I'm going with catching 
the AIOOBE (ArrayIndexOutOfBoundsException) and removing the warning, as it 
does not seem very useful.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
