[GitHub] [spark] adjordan edited a comment on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel
adjordan edited a comment on pull request #29080:
URL: https://github.com/apache/spark/pull/29080#issuecomment-658579068

Yes, I know the difference between the two. I just assumed that `MLUtils.kFold` was doing the splits according to the k-fold method, given the name, and not the random sub-sampling method. But I suppose changing the name of that method is outside the scope of what I'm trying to add.

In that case, it seems that I should add an additional `method` parameter where you can select k-fold or random sub-sampling. If I end up doing that, should I continue with this PR or open a new one? Thoughts @viirya?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] adjordan edited a comment on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel
adjordan edited a comment on pull request #29080:
URL: https://github.com/apache/spark/pull/29080#issuecomment-658579068

Yes, I know the difference between the two. I just assumed that `MLUtils.kFold` was doing the splits according to the k-fold method, given the name, and not the random sub-sampling method. But I suppose changing the name of that method is outside the scope of what I'm trying to add.

In that case, it seems that I should add an additional `method` parameter where you can select k-fold or random sub-sampling. If I end up doing that, should I continue with this PR or open a new one? Thoughts @viirya?
[GitHub] [spark] adjordan edited a comment on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel
adjordan edited a comment on pull request #29080:
URL: https://github.com/apache/spark/pull/29080#issuecomment-658547236

@viirya Sorry, can you explain? I don't see how it changes the technique; it just allows models from multiple folds to be run in parallel. `MLUtils.kFold` is doing k-fold cross-validation, not repeated random sub-sampling validation, right?
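For reference, the two techniques being contrasted in this thread can be sketched in plain Python. This is only an illustration of the general definitions, not a reproduction of what `MLUtils.kFold` does; the function names `k_fold_splits` and `random_subsampling_splits` and their parameters are hypothetical and exist only for this example.

```python
import random


def k_fold_splits(n, k, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation.

    The n indices are shuffled once and partitioned into k disjoint folds,
    so every index appears in exactly one validation set.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation


def random_subsampling_splits(n, rounds, val_fraction, seed=0):
    """Yield (train, validation) index lists for repeated random sub-sampling.

    Each round draws a fresh random validation set, so an index may be
    validated several times or never across rounds.
    """
    rng = random.Random(seed)
    n_val = max(1, int(n * val_fraction))
    for _ in range(rounds):
        validation = rng.sample(range(n), n_val)
        held_out = set(validation)
        train = [j for j in range(n) if j not in held_out]
        yield train, validation
```

In both schemes each (train, validation) pair is independent of the others once the split is fixed, so the per-split model fits can be run in parallel without changing which technique is being applied.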
[GitHub] [spark] adjordan edited a comment on pull request #29080: [SPARK-32271][ML] Update CrossValidator to train folds in parallel
adjordan edited a comment on pull request #29080:
URL: https://github.com/apache/spark/pull/29080#issuecomment-658547236

@viirya Sorry, can you explain? I don't see how it changes the technique; it just allows models from multiple folds to be run in parallel.