[38/51] [partial] spark-website git commit: Add 2.1.2 docs
http://git-wip-us.apache.org/repos/asf/spark-website/blob/a6155a89/site/docs/2.1.2/api/R/spark.gbt.html -- diff --git a/site/docs/2.1.2/api/R/spark.gbt.html b/site/docs/2.1.2/api/R/spark.gbt.html new file mode 100644 index 000..98b2b03 --- /dev/null +++ b/site/docs/2.1.2/api/R/spark.gbt.html @@ -0,0 +1,244 @@

spark.gbt {SparkR} -- R Documentation

Gradient Boosted Tree Model for Regression and Classification

Description

spark.gbt fits a Gradient Boosted Tree Regression or Classification model on a
SparkDataFrame. Users can call summary to get a summary of the fitted
Gradient Boosted Tree model, predict to make predictions on new data, and
write.ml/read.ml to save/load fitted models.
For more details, see GBT Regression
(http://spark.apache.org/docs/latest/ml-classification-regression.html#gradient-boosted-tree-regression)
and GBT Classification
(http://spark.apache.org/docs/latest/ml-classification-regression.html#gradient-boosted-tree-classifier).

Usage

spark.gbt(data, formula, ...)
## S4 method for signature 'SparkDataFrame,formula'
spark.gbt(data, formula,
  type = c("regression", "classification"), maxDepth = 5, maxBins = 32,
  maxIter = 20, stepSize = 0.1, lossType = NULL, seed = NULL,
  subsamplingRate = 1, minInstancesPerNode = 1, minInfoGain = 0,
  checkpointInterval = 10, maxMemoryInMB = 256, cacheNodeIds = FALSE)

## S4 method for signature 'GBTRegressionModel'
predict(object, newData)

## S4 method for signature 'GBTClassificationModel'
predict(object, newData)

## S4 method for signature 'GBTRegressionModel,character'
write.ml(object, path, overwrite = FALSE)

## S4 method for signature 'GBTClassificationModel,character'
write.ml(object, path, overwrite = FALSE)

## S4 method for signature 'GBTRegressionModel'
summary(object)

## S4 method for signature 'GBTClassificationModel'
summary(object)

## S3 method for class 'summary.GBTRegressionModel'
print(x, ...)

## S3 method for class 'summary.GBTClassificationModel'
print(x, ...)

Arguments

data

a SparkDataFrame for training.

formula

a symbolic description of the model to be fitted. Currently only a few formula
operators are supported, including '~', ':', '+', and '-'.

...

additional arguments passed to the method.

type

type of model to fit, one of "regression" or "classification".

maxDepth

Maximum depth of the tree (>= 0).

maxBins

Maximum number of bins used for discretizing continuous features and for choosing
how to split on features at each node. More bins give higher granularity. Must be
>= 2 and >= number of categories in any categorical feature.

maxIter

Param for maximum number of iterations (>= 0).

stepSize

Param for step size to be used for each iteration of optimization.

lossType

Loss function which GBT tries to minimize.
For classification, must be "logistic". For regression, must be one of
"squared" (L2) and "absolute" (L1); default is "squared".
seed

integer seed for random number generation.

subsamplingRate

Fraction of the training data used for learning each decision tree, in
range (0, 1].

minInstancesPerNode

Minimum number of instances each child must have after a split. If a
split causes the left or right child to have fewer than
minInstancesPerNode instances, the split will be discarded as invalid. Should be
>= 1.

minInfoGain

Minimum information gain for a split to be considered at a tree node.

checkpointInterval

Param for set checkpoint interval (>= 1) or disable checkpoint (-1).

maxMemoryInMB

Maximum memory in MB allocated to histogram aggregation.

cacheNodeIds

If FALSE, the algorithm will pass trees to executors to match instances with
nodes. If TRUE, the algorithm will cache node IDs for each instance. Caching
can speed up training of deeper trees. Users can set how often the cache
should be checkpointed, or disable it, by setting checkpointInterval.

object

A fitted Gradient Boosted Tree regression model or classification model.

newData

a SparkDataFrame for testing.

path

The directory where the model is saved.

overwrite

Whether to overwrite if the output path already exists. Default is FALSE,
which means an exception is thrown if the output path exists.

x

summary object of a Gradient Boosted Tree regression model or classification model
returned by summary.

Value

spark.gbt returns a fitted Gradient Boosted Tree model.
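The fit/summarize/predict/save workflow described above can be sketched in R as follows. This is a minimal illustration, not part of the documented page: it assumes a working Spark installation reachable from SparkR, and uses the built-in longley dataset purely as example data.

```r
library(SparkR)
sparkR.session()

# Fit a GBT regression model; longley is a built-in R data frame.
df <- createDataFrame(longley)
model <- spark.gbt(df, Employed ~ ., type = "regression",
                   maxDepth = 5, maxIter = 20)

# Summary of the fitted model (feature importances, number of trees, ...).
summary(model)

# Predict on new data; predictions carry a "prediction" column.
predictions <- predict(model, df)
head(select(predictions, "Employed", "prediction"))

# Persist the fitted model and load it back.
modelPath <- tempfile(pattern = "spark-gbt")
write.ml(model, modelPath, overwrite = TRUE)
savedModel <- read.ml(modelPath)

sparkR.session.stop()
```

For classification, pass type = "classification" with a categorical label column; per the lossType description above, the loss is then "logistic".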