http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/spark.lda.html ---------------------------------------------------------------------- diff --git a/site/docs/2.2.0/api/R/spark.lda.html b/site/docs/2.2.0/api/R/spark.lda.html new file mode 100644 index 0000000..aa1cfa8 --- /dev/null +++ b/site/docs/2.2.0/api/R/spark.lda.html @@ -0,0 +1,247 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html><head><title>R: Latent Dirichlet Allocation</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> +<link rel="stylesheet" type="text/css" href="R.css"> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for spark.lda {SparkR}"><tr><td>spark.lda {SparkR}</td><td align="right">R Documentation</td></tr></table> + +<h2>Latent Dirichlet Allocation</h2> + +<h3>Description</h3> + +<p><code>spark.lda</code> fits a Latent Dirichlet Allocation model on a SparkDataFrame. Users can call +<code>summary</code> to get a summary of the fitted LDA model, <code>spark.posterior</code> to compute +posterior probabilities on new data, <code>spark.perplexity</code> to compute log perplexity on new +data and <code>write.ml</code>/<code>read.ml</code> to save/load fitted models. +</p> + + +<h3>Usage</h3> + +<pre> +spark.lda(data, ...) + +spark.posterior(object, newData) + +spark.perplexity(object, data) + +## S4 method for signature 'SparkDataFrame' +spark.lda(data, features = "features", k = 10, + maxIter = 20, optimizer = c("online", "em"), subsamplingRate = 0.05, + topicConcentration = -1, docConcentration = -1, + customizedStopWords = "", maxVocabSize = bitwShiftL(1, 18)) + +## S4 method for signature 'LDAModel' +summary(object, maxTermsPerTopic) + +## S4 method for signature 'LDAModel,SparkDataFrame' +spark.perplexity(object, data) + +## S4 method for signature 'LDAModel,SparkDataFrame' +spark.posterior(object, newData) + +## S4 method for signature 'LDAModel,character' +write.ml(object, path, overwrite = FALSE) +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>data</code></td> +<td> +<p>A SparkDataFrame for training.</p> +</td></tr> +<tr valign="top"><td><code>...</code></td> +<td> +<p>additional argument(s) passed to the method.</p> +</td></tr> +<tr valign="top"><td><code>object</code></td> +<td> +<p>A Latent Dirichlet Allocation model fitted by <code>spark.lda</code>.</p> +</td></tr> +<tr valign="top"><td><code>newData</code></td> +<td> +<p>A SparkDataFrame for testing.</p> +</td></tr> +<tr valign="top"><td><code>features</code></td> +<td> +<p>Features column name. 
Either a libSVM-format column or a character-format column is
+valid.</p>
+</td></tr>
+<tr valign="top"><td><code>k</code></td>
+<td>
+<p>Number of topics.</p>
+</td></tr>
+<tr valign="top"><td><code>maxIter</code></td>
+<td>
+<p>Maximum iterations.</p>
+</td></tr>
+<tr valign="top"><td><code>optimizer</code></td>
+<td>
+<p>Optimizer to train an LDA model, "online" or "em", default is "online".</p>
+</td></tr>
+<tr valign="top"><td><code>subsamplingRate</code></td>
+<td>
+<p>(For online optimizer) Fraction of the corpus to be sampled and used in
+each iteration of mini-batch gradient descent, in range (0, 1].</p>
+</td></tr>
+<tr valign="top"><td><code>topicConcentration</code></td>
+<td>
+<p>concentration parameter (commonly named <code>beta</code> or <code>eta</code>) for
+the prior placed on topic distributions over terms, default -1 to set automatically on the
+Spark side. Use <code>summary</code> to retrieve the effective topicConcentration. Only a
+length-1 numeric vector is accepted.</p>
+</td></tr>
+<tr valign="top"><td><code>docConcentration</code></td>
+<td>
+<p>concentration parameter (commonly named <code>alpha</code>) for the
+prior placed on document distributions over topics (<code>theta</code>), default -1 to set
+automatically on the Spark side. Use <code>summary</code> to retrieve the effective
+docConcentration. Only a numeric vector of length 1 or <code>k</code> is accepted.</p>
+</td></tr>
+<tr valign="top"><td><code>customizedStopWords</code></td>
+<td>
+<p>stopwords that need to be removed from the given corpus. The parameter is
+ignored if a libSVM-format column is used as the features column.</p>
+</td></tr>
+<tr valign="top"><td><code>maxVocabSize</code></td>
+<td>
+<p>maximum vocabulary size, default 1 << 18.</p>
+</td></tr>
+<tr valign="top"><td><code>maxTermsPerTopic</code></td>
+<td>
+<p>Maximum number of terms to collect for each topic. Default value of 10.</p>
+</td></tr>
+<tr valign="top"><td><code>path</code></td>
+<td>
+<p>The directory where the model is saved.</p>
+</td></tr>
+<tr valign="top"><td><code>overwrite</code></td>
+<td>
+<p>Overwrites or not if the output path already exists. Default is FALSE
+which means throw exception if the output path exists.</p>
+</td></tr>
+</table>
+
+
+<h3>Value</h3>
+
+<p><code>spark.lda</code> returns a fitted Latent Dirichlet Allocation model.
+</p>
+<p><code>summary</code> returns summary information of the fitted model, which is a list.
+The list includes
+</p>
+<table summary="R valueblock">
+<tr valign="top"><td><code><code>docConcentration</code></code></td>
+<td>
+<p>concentration parameter commonly named <code>alpha</code> for
+the prior placed on document distributions over topics <code>theta</code></p>
+</td></tr>
+<tr valign="top"><td><code><code>topicConcentration</code></code></td>
+<td>
+<p>concentration parameter commonly named <code>beta</code> or
+<code>eta</code> for the prior placed on topic distributions over terms</p>
+</td></tr>
+<tr valign="top"><td><code><code>logLikelihood</code></code></td>
+<td>
+<p>log likelihood of the entire corpus</p>
+</td></tr>
+<tr valign="top"><td><code><code>logPerplexity</code></code></td>
+<td>
+<p>log perplexity</p>
+</td></tr>
+<tr valign="top"><td><code><code>isDistributed</code></code></td>
+<td>
+<p>TRUE for distributed model while FALSE for local model</p>
+</td></tr>
+<tr valign="top"><td><code><code>vocabSize</code></code></td>
+<td>
+<p>number of terms in the corpus</p>
+</td></tr>
+<tr valign="top"><td><code><code>topics</code></code></td>
+<td>
+<p>top 10 terms and their weights of all topics</p>
+</td></tr>
+<tr valign="top"><td><code><code>vocabulary</code></code></td>
+<td>
+<p>all terms of the training corpus, NULL if a libSVM-format file was
+used as the training set</p>
+</td></tr>
+<tr valign="top"><td><code><code>trainingLogLikelihood</code></code></td>
+<td>
+<p>Log likelihood of the observed tokens in the training set,
+given the current parameter estimates:
+log P(docs | topics, topic distributions for docs, Dirichlet hyperparameters).
+It is only available for the distributed LDA model (i.e., optimizer = "em")</p>
+</td></tr>
+<tr valign="top"><td><code><code>logPrior</code></code></td>
+<td>
+<p>Log probability of the current parameter estimate:
+log P(topics, topic distributions for docs | Dirichlet hyperparameters).
+It is only available for the distributed LDA model (i.e., optimizer = "em")</p>
+</td></tr>
+</table>
+<p><code>spark.perplexity</code> returns the log perplexity of the given SparkDataFrame, or the log
+perplexity of the training data if the argument "data" is missing.
+</p>
+<p><code>spark.posterior</code> returns a SparkDataFrame containing posterior probability
+vectors in a column named "topicDistribution".
+</p>
+
+
+<h3>Note</h3>
+
+<p>spark.lda since 2.1.0
+</p>
+<p>summary(LDAModel) since 2.1.0
+</p>
+<p>spark.perplexity(LDAModel) since 2.1.0
+</p>
+<p>spark.posterior(LDAModel) since 2.1.0
+</p>
+<p>write.ml(LDAModel, character) since 2.1.0
+</p>
+
+
+<h3>See Also</h3>
+
+<p>topicmodels: <a href="https://cran.r-project.org/package=topicmodels">https://cran.r-project.org/package=topicmodels</a>
+</p>
+<p><a href="read.ml.html">read.ml</a>
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D text <- read.df("data/mllib/sample_lda_libsvm_data.txt", source = "libsvm")
+##D model <- spark.lda(data = text, optimizer = "em")
+##D 
+##D # get a summary of the model
+##D summary(model)
+##D 
+##D # compute posterior probabilities
+##D posterior <- spark.posterior(model, text)
+##D showDF(posterior)
+##D 
+##D # compute perplexity
+##D perplexity <- spark.perplexity(model, text)
+##D 
+##D # save and load the model
+##D path <- "path/to/model"
+##D write.ml(model, path)
+##D savedModel <- read.ml(path)
+##D summary(savedModel)
+## End(Not run)
+</code></pre>


+<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div>
+</body></html>
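A minimal end-to-end sketch of the spark.lda workflow documented above, combining the functions on this page with SparkR's randomSplit; the data path, the 80/20 split, and k = 5 are illustrative assumptions rather than recommendations:

<pre><code class="r">
# Fit LDA on a training split and evaluate perplexity on held-out documents.
# Assumes a running SparkR session and the libSVM sample data shipped with Spark.
text <- read.df("data/mllib/sample_lda_libsvm_data.txt", source = "libsvm")
splits <- randomSplit(text, weights = c(0.8, 0.2), seed = 42)
train <- splits[[1]]
test <- splits[[2]]

# k = 5 topics is an arbitrary illustrative choice; "online" is the default optimizer.
model <- spark.lda(train, k = 5, maxIter = 20)

# Lower held-out log perplexity suggests a better-generalizing topic model.
heldOutPerplexity <- spark.perplexity(model, test)

# Per-document topic mixtures for the held-out set, in column "topicDistribution".
posterior <- spark.posterior(model, test)
head(posterior)
</code></pre>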
http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/spark.logit.html
----------------------------------------------------------------------
diff --git a/site/docs/2.2.0/api/R/spark.logit.html b/site/docs/2.2.0/api/R/spark.logit.html
new file mode 100644
index 0000000..dc0d8e2
--- /dev/null
+++ b/site/docs/2.2.0/api/R/spark.logit.html
@@ -0,0 +1,202 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head><title>R: Logistic Regression Model</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link rel="stylesheet" type="text/css" href="R.css">
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.logit {SparkR}"><tr><td>spark.logit {SparkR}</td><td align="right">R Documentation</td></tr></table>
+
+<h2>Logistic Regression Model</h2>
+
+<h3>Description</h3>
+
+<p>Fits a logistic regression model against a SparkDataFrame. It supports "binomial": Binary logistic regression
+with pivoting; "multinomial": Multinomial logistic (softmax) regression without pivoting, similar to glmnet.
+Users can print, make predictions on the produced model and save the model to the input path.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.logit(data, formula, ...)
+
+## S4 method for signature 'SparkDataFrame,formula'
+spark.logit(data, formula, regParam = 0,
+  elasticNetParam = 0, maxIter = 100, tol = 1e-06, family = "auto",
+  standardization = TRUE, thresholds = 0.5, weightCol = NULL,
+  aggregationDepth = 2)
+
+## S4 method for signature 'LogisticRegressionModel'
+summary(object)
+
+## S4 method for signature 'LogisticRegressionModel'
+predict(object, newData)
+
+## S4 method for signature 'LogisticRegressionModel,character'
+write.ml(object, path,
+  overwrite = FALSE)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>data</code></td>
+<td>
+<p>SparkDataFrame for training.</p>
+</td></tr>
+<tr valign="top"><td><code>formula</code></td>
+<td>
+<p>A symbolic description of the model to be fitted. Currently only a few formula
+operators are supported, including '~', '.', ':', '+', and '-'.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>additional arguments passed to the method.</p>
+</td></tr>
+<tr valign="top"><td><code>regParam</code></td>
+<td>
+<p>the regularization parameter.</p>
+</td></tr>
+<tr valign="top"><td><code>elasticNetParam</code></td>
+<td>
+<p>the ElasticNet mixing parameter. For alpha = 0.0, the penalty is an L2 penalty.
+For alpha = 1.0, it is an L1 penalty. For 0.0 < alpha < 1.0, the penalty is a combination
+of L1 and L2. Default is 0.0 which is an L2 penalty.</p>
+</td></tr>
+<tr valign="top"><td><code>maxIter</code></td>
+<td>
+<p>maximum iteration number.</p>
+</td></tr>
+<tr valign="top"><td><code>tol</code></td>
+<td>
+<p>convergence tolerance of iterations.</p>
+</td></tr>
+<tr valign="top"><td><code>family</code></td>
+<td>
+<p>the name of the family, which describes the label distribution to be used in the model.
+Supported options:
+</p>
+
+<ul>
+<li><p>"auto": Automatically select the family based on the number of classes:
+If the number of classes is 1 or 2, set to "binomial".
+Else, set to "multinomial".
+</p>
+</li>
+<li><p>"binomial": Binary logistic regression with pivoting.
+</p>
+</li>
+<li><p>"multinomial": Multinomial logistic (softmax) regression without pivoting.
+</p>
+</li></ul>
+</td></tr>
+<tr valign="top"><td><code>standardization</code></td>
+<td>
+<p>whether to standardize the training features before fitting the model. The coefficients
+of models are always returned on the original scale, so standardization is transparent to
+users. Note that with or without standardization, the models should always converge
+to the same solution when no regularization is applied. Default is TRUE, same as glmnet.</p>
+</td></tr>
+<tr valign="top"><td><code>thresholds</code></td>
+<td>
+<p>in binary classification, a number in range [0, 1]: if the estimated probability of class label 1
+is > threshold, then predict 1, else 0. A high threshold encourages the model to predict 0
+more often; a low threshold encourages the model to predict 1 more often. Note: setting this with
+threshold p is equivalent to setting thresholds c(1-p, p). In multiclass (or binary)
+classification, a vector used to adjust the probability of predicting each class.
+The vector must have length equal to the number of classes, with values > 0,
+except that at most one value may be 0. The class with the largest value p/t is predicted, where p
+is the original probability of that class and t is the class's threshold.</p>
+</td></tr>
+<tr valign="top"><td><code>weightCol</code></td>
+<td>
+<p>The weight column name.</p>
+</td></tr>
+<tr valign="top"><td><code>aggregationDepth</code></td>
+<td>
+<p>The depth for treeAggregate (greater than or equal to 2). If the dimensions of features
+or the number of partitions are large, this param could be adjusted to a larger size.
+This is an expert parameter. Default value should be good for most cases.</p>
+</td></tr>
+<tr valign="top"><td><code>object</code></td>
+<td>
+<p>a LogisticRegressionModel fitted by <code>spark.logit</code>.</p>
+</td></tr>
+<tr valign="top"><td><code>newData</code></td>
+<td>
+<p>a SparkDataFrame for testing.</p>
+</td></tr>
+<tr valign="top"><td><code>path</code></td>
+<td>
+<p>The directory where the model is saved.</p>
+</td></tr>
+<tr valign="top"><td><code>overwrite</code></td>
+<td>
+<p>Overwrites or not if the output path already exists. Default is FALSE
+which means throw exception if the output path exists.</p>
+</td></tr>
+</table>
+
+
+<h3>Value</h3>
+
+<p><code>spark.logit</code> returns a fitted logistic regression model.
+</p>
+<p><code>summary</code> returns summary information of the fitted model, which is a list.
+The list includes <code>coefficients</code> (coefficients matrix of the fitted model).
+</p>
+<p><code>predict</code> returns the predicted values based on a LogisticRegressionModel.
+</p>
+
+
+<h3>Note</h3>
+
+<p>spark.logit since 2.1.0
+</p>
+<p>summary(LogisticRegressionModel) since 2.1.0
+</p>
+<p>predict(LogisticRegressionModel) since 2.1.0
+</p>
+<p>write.ml(LogisticRegressionModel, character) since 2.1.0
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D sparkR.session()
+##D # binary logistic regression
+##D t <- as.data.frame(Titanic)
+##D training <- createDataFrame(t)
+##D model <- spark.logit(training, Survived ~ ., regParam = 0.5)
+##D summary <- summary(model)
+##D 
+##D # fitted values on training data
+##D fitted <- predict(model, training)
+##D 
+##D # save fitted model to input path
+##D path <- "path/to/model"
+##D write.ml(model, path)
+##D 
+##D # can also read back the saved model and predict
+##D # Note that summary does not work on loaded model
+##D savedModel <- read.ml(path)
+##D summary(savedModel)
+##D 
+##D # multinomial logistic regression
+##D 
+##D model <- spark.logit(training, Class ~ ., regParam = 0.5)
+##D summary <- summary(model)
+##D 
+## End(Not run)
+</code></pre>
+
+
+<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/spark.mlp.html
----------------------------------------------------------------------
diff --git a/site/docs/2.2.0/api/R/spark.mlp.html b/site/docs/2.2.0/api/R/spark.mlp.html
new file mode 100644
index 0000000..3a77288
--- /dev/null
+++ b/site/docs/2.2.0/api/R/spark.mlp.html
@@ -0,0 +1,181 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head><title>R: Multilayer Perceptron Classification Model</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link rel="stylesheet" type="text/css" href="R.css">
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark.mlp {SparkR}"><tr><td>spark.mlp {SparkR}</td><td align="right">R Documentation</td></tr></table>
+
+<h2>Multilayer Perceptron Classification Model</h2>
+
+<h3>Description</h3>
+
+<p><code>spark.mlp</code> fits a multi-layer perceptron neural network model against a SparkDataFrame.
+Users can call <code>summary</code> to print a summary of the fitted model, <code>predict</code> to make
+predictions on new data, and <code>write.ml</code>/<code>read.ml</code> to save/load fitted models.
+Only categorical data is supported.
+For more details, see
+<a href="http://spark.apache.org/docs/latest/ml-classification-regression.html">
+Multilayer Perceptron</a>
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+spark.mlp(data, formula, ...)
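+
+## layers (illustrative sketch, not part of the signature): c(4, 5, 3) means
+## 4 input features, one hidden layer of 5 nodes, and 3 output classes; the
+## fitted weights vector then has (4+1)*5 + (5+1)*3 = 43 entries, following the
+## same convention as the 8-10-2 network described in the Value section below.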
+
+## S4 method for signature 'SparkDataFrame,formula'
+spark.mlp(data, formula, layers,
+  blockSize = 128, solver = "l-bfgs", maxIter = 100, tol = 1e-06,
+  stepSize = 0.03, seed = NULL, initialWeights = NULL)
+
+## S4 method for signature 'MultilayerPerceptronClassificationModel'
+summary(object)
+
+## S4 method for signature 'MultilayerPerceptronClassificationModel'
+predict(object, newData)
+
+## S4 method for signature 'MultilayerPerceptronClassificationModel,character'
+write.ml(object,
+  path, overwrite = FALSE)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>data</code></td>
+<td>
+<p>a <code>SparkDataFrame</code> of observations and labels for model fitting.</p>
+</td></tr>
+<tr valign="top"><td><code>formula</code></td>
+<td>
+<p>a symbolic description of the model to be fitted. Currently only a few formula
+operators are supported, including '~', '.', ':', '+', and '-'.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>additional arguments passed to the method.</p>
+</td></tr>
+<tr valign="top"><td><code>layers</code></td>
+<td>
+<p>integer vector containing the number of nodes for each layer.</p>
+</td></tr>
+<tr valign="top"><td><code>blockSize</code></td>
+<td>
+<p>blockSize parameter.</p>
+</td></tr>
+<tr valign="top"><td><code>solver</code></td>
+<td>
+<p>solver parameter, supported options: "gd" (minibatch gradient descent) or "l-bfgs".</p>
+</td></tr>
+<tr valign="top"><td><code>maxIter</code></td>
+<td>
+<p>maximum iteration number.</p>
+</td></tr>
+<tr valign="top"><td><code>tol</code></td>
+<td>
+<p>convergence tolerance of iterations.</p>
+</td></tr>
+<tr valign="top"><td><code>stepSize</code></td>
+<td>
+<p>stepSize parameter.</p>
+</td></tr>
+<tr valign="top"><td><code>seed</code></td>
+<td>
+<p>seed parameter for weights initialization.</p>
+</td></tr>
+<tr valign="top"><td><code>initialWeights</code></td>
+<td>
+<p>initialWeights parameter for weights initialization; it should be a
+numeric vector.</p>
+</td></tr>
+<tr valign="top"><td><code>object</code></td>
+<td>
+<p>a Multilayer Perceptron Classification Model fitted by <code>spark.mlp</code>.</p>
+</td></tr>
+<tr valign="top"><td><code>newData</code></td>
+<td>
+<p>a SparkDataFrame for testing.</p>
+</td></tr>
+<tr valign="top"><td><code>path</code></td>
+<td>
+<p>the directory where the model is saved.</p>
+</td></tr>
+<tr valign="top"><td><code>overwrite</code></td>
+<td>
+<p>overwrites or not if the output path already exists. Default is FALSE
+which means throw exception if the output path exists.</p>
+</td></tr>
+</table>
+
+
+<h3>Value</h3>
+
+<p><code>spark.mlp</code> returns a fitted Multilayer Perceptron Classification Model.
+</p>
+<p><code>summary</code> returns summary information of the fitted model, which is a list.
+The list includes <code>numOfInputs</code> (number of inputs), <code>numOfOutputs</code>
+(number of outputs), <code>layers</code> (array of layer sizes including input
+and output layers), and <code>weights</code> (the weights of layers).
+For <code>weights</code>, it is a numeric vector whose length matches the
+architecture (e.g., for an 8-10-2 network, 112 connection weights).
+</p>
+<p><code>predict</code> returns a SparkDataFrame containing predicted labels in a column named
+"prediction".
+</p> + + +<h3>Note</h3> + +<p>spark.mlp since 2.1.0 +</p> +<p>summary(MultilayerPerceptronClassificationModel) since 2.1.0 +</p> +<p>predict(MultilayerPerceptronClassificationModel) since 2.1.0 +</p> +<p>write.ml(MultilayerPerceptronClassificationModel, character) since 2.1.0 +</p> + + +<h3>See Also</h3> + +<p><a href="read.ml.html">read.ml</a> +</p> +<p><a href="write.ml.html">write.ml</a> +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D df <- read.df("data/mllib/sample_multiclass_classification_data.txt", source = "libsvm") +##D +##D # fit a Multilayer Perceptron Classification Model +##D model <- spark.mlp(df, label ~ features, blockSize = 128, layers = c(4, 3), solver = "l-bfgs", +##D maxIter = 100, tol = 0.5, stepSize = 1, seed = 1, +##D initialWeights = c(0, 0, 0, 0, 0, 5, 5, 5, 5, 5, 9, 9, 9, 9, 9)) +##D +##D # get the summary of the model +##D summary(model) +##D +##D # make predictions +##D predictions <- predict(model, df) +##D +##D # save and load the model +##D path <- "path/to/model" +##D write.ml(model, path) +##D savedModel <- read.ml(path) +##D summary(savedModel) +## End(Not run) +</code></pre> + + +<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div> +</body></html> http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/spark.naiveBayes.html ---------------------------------------------------------------------- diff --git a/site/docs/2.2.0/api/R/spark.naiveBayes.html b/site/docs/2.2.0/api/R/spark.naiveBayes.html new file mode 100644 index 0000000..36ab890 --- /dev/null +++ b/site/docs/2.2.0/api/R/spark.naiveBayes.html @@ -0,0 +1,144 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html><head><title>R: Naive Bayes Models</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> +<link rel="stylesheet" type="text/css" href="R.css"> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for spark.naiveBayes {SparkR}"><tr><td>spark.naiveBayes {SparkR}</td><td align="right">R Documentation</td></tr></table> + +<h2>Naive Bayes Models</h2> + +<h3>Description</h3> + +<p><code>spark.naiveBayes</code> fits a Bernoulli naive Bayes model against a SparkDataFrame. +Users can call <code>summary</code> to print a summary of the fitted model, <code>predict</code> to make +predictions on new data, and <code>write.ml</code>/<code>read.ml</code> to save/load fitted models. +Only categorical data is supported. +</p> + + +<h3>Usage</h3> + +<pre> +spark.naiveBayes(data, formula, ...) 
+
+## S4 method for signature 'SparkDataFrame,formula'
+spark.naiveBayes(data, formula,
+  smoothing = 1)
+
+## S4 method for signature 'NaiveBayesModel'
+summary(object)
+
+## S4 method for signature 'NaiveBayesModel'
+predict(object, newData)
+
+## S4 method for signature 'NaiveBayesModel,character'
+write.ml(object, path,
+  overwrite = FALSE)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>data</code></td>
+<td>
+<p>a <code>SparkDataFrame</code> of observations and labels for model fitting.</p>
+</td></tr>
+<tr valign="top"><td><code>formula</code></td>
+<td>
+<p>a symbolic description of the model to be fitted. Currently only a few formula
+operators are supported, including '~', '.', ':', '+', and '-'.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>additional argument(s) passed to the method. Currently only <code>smoothing</code>.</p>
+</td></tr>
+<tr valign="top"><td><code>smoothing</code></td>
+<td>
+<p>smoothing parameter (additive or Laplace smoothing).</p>
+</td></tr>
+<tr valign="top"><td><code>object</code></td>
+<td>
+<p>a naive Bayes model fitted by <code>spark.naiveBayes</code>.</p>
+</td></tr>
+<tr valign="top"><td><code>newData</code></td>
+<td>
+<p>a SparkDataFrame for testing.</p>
+</td></tr>
+<tr valign="top"><td><code>path</code></td>
+<td>
+<p>the directory where the model is saved.</p>
+</td></tr>
+<tr valign="top"><td><code>overwrite</code></td>
+<td>
+<p>overwrites or not if the output path already exists. Default is FALSE
+which means throw exception if the output path exists.</p>
+</td></tr>
+</table>
+
+
+<h3>Value</h3>
+
+<p><code>spark.naiveBayes</code> returns a fitted naive Bayes model.
+</p>
+<p><code>summary</code> returns summary information of the fitted model, which is a list.
+The list includes <code>apriori</code> (the label distribution) and
+<code>tables</code> (conditional probabilities given the target label).
+</p>
+<p><code>predict</code> returns a SparkDataFrame containing predicted labels in a column named
+"prediction".
+</p> + + +<h3>Note</h3> + +<p>spark.naiveBayes since 2.0.0 +</p> +<p>summary(NaiveBayesModel) since 2.0.0 +</p> +<p>predict(NaiveBayesModel) since 2.0.0 +</p> +<p>write.ml(NaiveBayesModel, character) since 2.0.0 +</p> + + +<h3>See Also</h3> + +<p>e1071: <a href="https://cran.r-project.org/package=e1071">https://cran.r-project.org/package=e1071</a> +</p> +<p><a href="write.ml.html">write.ml</a> +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D data <- as.data.frame(UCBAdmissions) +##D df <- createDataFrame(data) +##D +##D # fit a Bernoulli naive Bayes model +##D model <- spark.naiveBayes(df, Admit ~ Gender + Dept, smoothing = 0) +##D +##D # get the summary of the model +##D summary(model) +##D +##D # make predictions +##D predictions <- predict(model, df) +##D +##D # save and load the model +##D path <- "path/to/model" +##D write.ml(model, path) +##D savedModel <- read.ml(path) +##D summary(savedModel) +## End(Not run) +</code></pre> + + +<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div> +</body></html> http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/spark.randomForest.html ---------------------------------------------------------------------- diff --git a/site/docs/2.2.0/api/R/spark.randomForest.html b/site/docs/2.2.0/api/R/spark.randomForest.html new file mode 100644 index 0000000..0d2e001 --- /dev/null +++ b/site/docs/2.2.0/api/R/spark.randomForest.html @@ -0,0 +1,238 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html><head><title>R: Random Forest Model for Regression and Classification</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> +<link rel="stylesheet" type="text/css" href="R.css"> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for spark.randomForest {SparkR}"><tr><td>spark.randomForest {SparkR}</td><td align="right">R Documentation</td></tr></table> + +<h2>Random Forest Model for Regression and Classification</h2> + +<h3>Description</h3> + +<p><code>spark.randomForest</code> fits a Random Forest Regression model or Classification model on +a SparkDataFrame. Users can call <code>summary</code> to get a summary of the fitted Random Forest +model, <code>predict</code> to make predictions on new data, and <code>write.ml</code>/<code>read.ml</code> to +save/load fitted models. +For more details, see +<a href="http://spark.apache.org/docs/latest/ml-classification-regression.html#random-forest-regression"> +Random Forest Regression</a> and +<a href="http://spark.apache.org/docs/latest/ml-classification-regression.html#random-forest-classifier"> +Random Forest Classification</a> +</p> + + +<h3>Usage</h3> + +<pre> +spark.randomForest(data, formula, ...) 
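+
+## impurity may be left NULL (an illustrative note): it then defaults to
+## "variance" for type = "regression" and "gini" for type = "classification",
+## matching the impurity argument described below.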
+ +## S4 method for signature 'SparkDataFrame,formula' +spark.randomForest(data, formula, + type = c("regression", "classification"), maxDepth = 5, maxBins = 32, + numTrees = 20, impurity = NULL, featureSubsetStrategy = "auto", + seed = NULL, subsamplingRate = 1, minInstancesPerNode = 1, + minInfoGain = 0, checkpointInterval = 10, maxMemoryInMB = 256, + cacheNodeIds = FALSE) + +## S4 method for signature 'RandomForestRegressionModel' +summary(object) + +## S3 method for class 'summary.RandomForestRegressionModel' +print(x, ...) + +## S4 method for signature 'RandomForestClassificationModel' +summary(object) + +## S3 method for class 'summary.RandomForestClassificationModel' +print(x, ...) + +## S4 method for signature 'RandomForestRegressionModel' +predict(object, newData) + +## S4 method for signature 'RandomForestClassificationModel' +predict(object, newData) + +## S4 method for signature 'RandomForestRegressionModel,character' +write.ml(object, path, + overwrite = FALSE) + +## S4 method for signature 'RandomForestClassificationModel,character' +write.ml(object, path, + overwrite = FALSE) +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>data</code></td> +<td> +<p>a SparkDataFrame for training.</p> +</td></tr> +<tr valign="top"><td><code>formula</code></td> +<td> +<p>a symbolic description of the model to be fitted. Currently only a few formula +operators are supported, including '~', ':', '+', and '-'.</p> +</td></tr> +<tr valign="top"><td><code>...</code></td> +<td> +<p>additional arguments passed to the method.</p> +</td></tr> +<tr valign="top"><td><code>type</code></td> +<td> +<p>type of model, one of "regression" or "classification", to fit</p> +</td></tr> +<tr valign="top"><td><code>maxDepth</code></td> +<td> +<p>Maximum depth of the tree (>= 0).</p> +</td></tr> +<tr valign="top"><td><code>maxBins</code></td> +<td> +<p>Maximum number of bins used for discretizing continuous features and for choosing +how to split on features at each node. More bins give higher granularity. Must be +>= 2 and >= number of categories in any categorical feature.</p> +</td></tr> +<tr valign="top"><td><code>numTrees</code></td> +<td> +<p>Number of trees to train (>= 1).</p> +</td></tr> +<tr valign="top"><td><code>impurity</code></td> +<td> +<p>Criterion used for information gain calculation. +For regression, must be "variance". For classification, must be one of +"entropy" and "gini", default is "gini".</p> +</td></tr> +<tr valign="top"><td><code>featureSubsetStrategy</code></td> +<td> +<p>The number of features to consider for splits at each tree node. 
+Supported options: "auto", "all", "onethird", "sqrt", "log2", (0.0-1.0], [1-n].</p> +</td></tr> +<tr valign="top"><td><code>seed</code></td> +<td> +<p>integer seed for random number generation.</p> +</td></tr> +<tr valign="top"><td><code>subsamplingRate</code></td> +<td> +<p>Fraction of the training data used for learning each decision tree, in +range (0, 1].</p> +</td></tr> +<tr valign="top"><td><code>minInstancesPerNode</code></td> +<td> +<p>Minimum number of instances each child must have after split.</p> +</td></tr> +<tr valign="top"><td><code>minInfoGain</code></td> +<td> +<p>Minimum information gain for a split to be considered at a tree node.</p> +</td></tr> +<tr valign="top"><td><code>checkpointInterval</code></td> +<td> +<p>Param for set checkpoint interval (>= 1) or disable checkpoint (-1).</p> +</td></tr> +<tr valign="top"><td><code>maxMemoryInMB</code></td> +<td> +<p>Maximum memory in MB allocated to histogram aggregation.</p> +</td></tr> +<tr valign="top"><td><code>cacheNodeIds</code></td> +<td> +<p>If FALSE, the algorithm will pass trees to executors to match instances with +nodes. If TRUE, the algorithm will cache node IDs for each instance. Caching +can speed up training of deeper trees. Users can set how often should the +cache be checkpointed or disable it by setting checkpointInterval.</p> +</td></tr> +<tr valign="top"><td><code>object</code></td> +<td> +<p>A fitted Random Forest regression model or classification model.</p> +</td></tr> +<tr valign="top"><td><code>x</code></td> +<td> +<p>summary object of Random Forest regression model or classification model +returned by <code>summary</code>.</p> +</td></tr> +<tr valign="top"><td><code>newData</code></td> +<td> +<p>a SparkDataFrame for testing.</p> +</td></tr> +<tr valign="top"><td><code>path</code></td> +<td> +<p>The directory where the model is saved.</p> +</td></tr> +<tr valign="top"><td><code>overwrite</code></td> +<td> +<p>Overwrites or not if the output path already exists. Default is FALSE +which means throw exception if the output path exists.</p> +</td></tr> +</table> + + +<h3>Value</h3> + +<p><code>spark.randomForest</code> returns a fitted Random Forest model. +</p> +<p><code>summary</code> returns summary information of the fitted model, which is a list. +The list of components includes <code>formula</code> (formula), +<code>numFeatures</code> (number of features), <code>features</code> (list of features), +<code>featureImportances</code> (feature importances), <code>maxDepth</code> (max depth of trees), +<code>numTrees</code> (number of trees), and <code>treeWeights</code> (tree weights). +</p> +<p><code>predict</code> returns a SparkDataFrame containing predicted labeled in a column named +"prediction". 
+</p> + + +<h3>Note</h3> + +<p>spark.randomForest since 2.1.0 +</p> +<p>summary(RandomForestRegressionModel) since 2.1.0 +</p> +<p>print.summary.RandomForestRegressionModel since 2.1.0 +</p> +<p>summary(RandomForestClassificationModel) since 2.1.0 +</p> +<p>print.summary.RandomForestClassificationModel since 2.1.0 +</p> +<p>predict(RandomForestRegressionModel) since 2.1.0 +</p> +<p>predict(RandomForestClassificationModel) since 2.1.0 +</p> +<p>write.ml(RandomForestRegressionModel, character) since 2.1.0 +</p> +<p>write.ml(RandomForestClassificationModel, character) since 2.1.0 +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D # fit a Random Forest Regression Model +##D df <- createDataFrame(longley) +##D model <- spark.randomForest(df, Employed ~ ., type = "regression", maxDepth = 5, maxBins = 16) +##D +##D # get the summary of the model +##D summary(model) +##D +##D # make predictions +##D predictions <- predict(model, df) +##D +##D # save and load the model +##D path <- "path/to/model" +##D write.ml(model, path) +##D savedModel <- read.ml(path) +##D summary(savedModel) +##D +##D # fit a Random Forest Classification Model +##D t <- as.data.frame(Titanic) +##D df <- createDataFrame(t) +##D model <- spark.randomForest(df, Survived ~ Freq + Age, "classification") +## End(Not run) +</code></pre> + + +<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div> +</body></html> http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/spark.survreg.html ---------------------------------------------------------------------- diff --git a/site/docs/2.2.0/api/R/spark.survreg.html b/site/docs/2.2.0/api/R/spark.survreg.html new file mode 100644 index 0000000..beea2e7 --- /dev/null +++ b/site/docs/2.2.0/api/R/spark.survreg.html @@ -0,0 +1,145 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html><head><title>R: Accelerated Failure Time (AFT) Survival Regression Model</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> +<link rel="stylesheet" type="text/css" href="R.css"> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for spark.survreg {SparkR}"><tr><td>spark.survreg {SparkR}</td><td align="right">R Documentation</td></tr></table> + +<h2>Accelerated Failure Time (AFT) Survival Regression Model</h2> + +<h3>Description</h3> + +<p><code>spark.survreg</code> fits an accelerated failure time (AFT) survival regression model on +a SparkDataFrame. Users can call <code>summary</code> to get a summary of the fitted AFT model, +<code>predict</code> to make predictions on new data, and <code>write.ml</code>/<code>read.ml</code> to +save/load fitted models. +</p> + + +<h3>Usage</h3> + +<pre> +spark.survreg(data, formula, ...) 
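+
+## the left-hand side of formula is a survival term, e.g.
+## Surv(futime, fustat) ~ ecog_ps + rx (see the example at the end of this page).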
+ +## S4 method for signature 'SparkDataFrame,formula' +spark.survreg(data, formula, + aggregationDepth = 2) + +## S4 method for signature 'AFTSurvivalRegressionModel' +summary(object) + +## S4 method for signature 'AFTSurvivalRegressionModel' +predict(object, newData) + +## S4 method for signature 'AFTSurvivalRegressionModel,character' +write.ml(object, path, + overwrite = FALSE) +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>data</code></td> +<td> +<p>a SparkDataFrame for training.</p> +</td></tr> +<tr valign="top"><td><code>formula</code></td> +<td> +<p>a symbolic description of the model to be fitted. Currently only a few formula +operators are supported, including '~', ':', '+', and '-'. +Note that operator '.' is not supported currently.</p> +</td></tr> +<tr valign="top"><td><code>...</code></td> +<td> +<p>additional arguments passed to the method.</p> +</td></tr> +<tr valign="top"><td><code>aggregationDepth</code></td> +<td> +<p>The depth for treeAggregate (greater than or equal to 2). If the dimensions of features +or the number of partitions are large, this param could be adjusted to a larger size. +This is an expert parameter. Default value should be good for most cases.</p> +</td></tr> +<tr valign="top"><td><code>object</code></td> +<td> +<p>a fitted AFT survival regression model.</p> +</td></tr> +<tr valign="top"><td><code>newData</code></td> +<td> +<p>a SparkDataFrame for testing.</p> +</td></tr> +<tr valign="top"><td><code>path</code></td> +<td> +<p>the directory where the model is saved.</p> +</td></tr> +<tr valign="top"><td><code>overwrite</code></td> +<td> +<p>overwrites or not if the output path already exists. Default is FALSE +which means throw exception if the output path exists.</p> +</td></tr> +</table> + + +<h3>Value</h3> + +<p><code>spark.survreg</code> returns a fitted AFT survival regression model. +</p> +<p><code>summary</code> returns summary information of the fitted model, which is a list. +The list includes the model's <code>coefficients</code> (features, coefficients, +intercept and log(scale)). +</p> +<p><code>predict</code> returns a SparkDataFrame containing predicted values +on the original scale of the data (mean predicted value at scale = 1.0). 
+</p> + + +<h3>Note</h3> + +<p>spark.survreg since 2.0.0 +</p> +<p>summary(AFTSurvivalRegressionModel) since 2.0.0 +</p> +<p>predict(AFTSurvivalRegressionModel) since 2.0.0 +</p> +<p>write.ml(AFTSurvivalRegressionModel, character) since 2.0.0 +</p> + + +<h3>See Also</h3> + +<p>survival: <a href="https://cran.r-project.org/package=survival">https://cran.r-project.org/package=survival</a> +</p> +<p><a href="write.ml.html">write.ml</a> +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D df <- createDataFrame(ovarian) +##D model <- spark.survreg(df, Surv(futime, fustat) ~ ecog_ps + rx) +##D +##D # get a summary of the model +##D summary(model) +##D +##D # make predictions +##D predicted <- predict(model, df) +##D showDF(predicted) +##D +##D # save and load the model +##D path <- "path/to/model" +##D write.ml(model, path) +##D savedModel <- read.ml(path) +##D summary(savedModel) +## End(Not run) +</code></pre> + + +<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div> +</body></html> http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/spark.svmLinear.html ---------------------------------------------------------------------- diff --git a/site/docs/2.2.0/api/R/spark.svmLinear.html b/site/docs/2.2.0/api/R/spark.svmLinear.html new file mode 100644 index 0000000..3f33bbf --- /dev/null +++ b/site/docs/2.2.0/api/R/spark.svmLinear.html @@ -0,0 +1,165 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html><head><title>R: Linear SVM Model</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> +<link rel="stylesheet" type="text/css" href="R.css"> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for spark.svmLinear {SparkR}"><tr><td>spark.svmLinear {SparkR}</td><td align="right">R Documentation</td></tr></table> + +<h2>Linear SVM Model</h2> + +<h3>Description</h3> + +<p>Fits a linear SVM model against a SparkDataFrame, similar to svm in e1071 package. +Currently only supports binary classification model with linear kernel. +Users can print, make predictions on the produced model and save the model to the input path. +</p> + + +<h3>Usage</h3> + +<pre> +spark.svmLinear(data, formula, ...) + +## S4 method for signature 'SparkDataFrame,formula' +spark.svmLinear(data, formula, + regParam = 0, maxIter = 100, tol = 1e-06, standardization = TRUE, + threshold = 0, weightCol = NULL, aggregationDepth = 2) + +## S4 method for signature 'LinearSVCModel' +predict(object, newData) + +## S4 method for signature 'LinearSVCModel' +summary(object) + +## S4 method for signature 'LinearSVCModel,character' +write.ml(object, path, overwrite = FALSE) +</pre> + + +<h3>Arguments</h3> + +<table summary="R argblock"> +<tr valign="top"><td><code>data</code></td> +<td> +<p>SparkDataFrame for training.</p> +</td></tr> +<tr valign="top"><td><code>formula</code></td> +<td> +<p>A symbolic description of the model to be fitted. 
Currently only a few formula +operators are supported, including '~', '.', ':', '+', and '-'.</p> +</td></tr> +<tr valign="top"><td><code>...</code></td> +<td> +<p>additional arguments passed to the method.</p> +</td></tr> +<tr valign="top"><td><code>regParam</code></td> +<td> +<p>The regularization parameter. Only supports L2 regularization currently.</p> +</td></tr> +<tr valign="top"><td><code>maxIter</code></td> +<td> +<p>Maximum iteration number.</p> +</td></tr> +<tr valign="top"><td><code>tol</code></td> +<td> +<p>Convergence tolerance of iterations.</p> +</td></tr> +<tr valign="top"><td><code>standardization</code></td> +<td> +<p>Whether to standardize the training features before fitting the model. The coefficients +of models will be always returned on the original scale, so it will be transparent for +users. Note that with/without standardization, the models should be always converged +to the same solution when no regularization is applied.</p> +</td></tr> +<tr valign="top"><td><code>threshold</code></td> +<td> +<p>The threshold in binary classification applied to the linear model prediction. +This threshold can be any real number, where Inf will make all predictions 0.0 +and -Inf will make all predictions 1.0.</p> +</td></tr> +<tr valign="top"><td><code>weightCol</code></td> +<td> +<p>The weight column name.</p> +</td></tr> +<tr valign="top"><td><code>aggregationDepth</code></td> +<td> +<p>The depth for treeAggregate (greater than or equal to 2). If the dimensions of features +or the number of partitions are large, this param could be adjusted to a larger size. +This is an expert parameter. Default value should be good for most cases.</p> +</td></tr> +<tr valign="top"><td><code>object</code></td> +<td> +<p>a LinearSVCModel fitted by <code>spark.svmLinear</code>.</p> +</td></tr> +<tr valign="top"><td><code>newData</code></td> +<td> +<p>a SparkDataFrame for testing.</p> +</td></tr> +<tr valign="top"><td><code>path</code></td> +<td> +<p>The directory where the model is saved.</p> +</td></tr> +<tr valign="top"><td><code>overwrite</code></td> +<td> +<p>Overwrites or not if the output path already exists. Default is FALSE +which means throw exception if the output path exists.</p> +</td></tr> +</table> + + +<h3>Value</h3> + +<p><code>spark.svmLinear</code> returns a fitted linear SVM model. +</p> +<p><code>predict</code> returns the predicted values based on a LinearSVCModel. +</p> +<p><code>summary</code> returns summary information of the fitted model, which is a list. +The list includes <code>coefficients</code> (coefficients of the fitted model), +<code>numClasses</code> (number of classes), <code>numFeatures</code> (number of features). 
+</p>
+
+
+<h3>Note</h3>
+
+<p>spark.svmLinear since 2.2.0
+</p>
+<p>predict(LinearSVCModel) since 2.2.0
+</p>
+<p>summary(LinearSVCModel) since 2.2.0
+</p>
+<p>write.ml(LinearSVCModel, character) since 2.2.0
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D sparkR.session()
+##D t <- as.data.frame(Titanic)
+##D training <- createDataFrame(t)
+##D model <- spark.svmLinear(training, Survived ~ ., regParam = 0.5)
+##D summary <- summary(model)
+##D 
+##D # fitted values on training data
+##D fitted <- predict(model, training)
+##D 
+##D # save fitted model to input path
+##D path <- "path/to/model"
+##D write.ml(model, path)
+##D 
+##D # can also read back the saved model and predict
+##D # Note that summary does not work on loaded model
+##D savedModel <- read.ml(path)
+##D summary(savedModel)
+## End(Not run)
+</code></pre>
+
+
+<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/sparkR.callJMethod.html
----------------------------------------------------------------------
diff --git a/site/docs/2.2.0/api/R/sparkR.callJMethod.html b/site/docs/2.2.0/api/R/sparkR.callJMethod.html
new file mode 100644
index 0000000..10308bf
--- /dev/null
+++ b/site/docs/2.2.0/api/R/sparkR.callJMethod.html
@@ -0,0 +1,91 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head><title>R: Call Java Methods</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link rel="stylesheet" type="text/css" href="R.css">
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for sparkR.callJMethod {SparkR}"><tr><td>sparkR.callJMethod {SparkR}</td><td align="right">R Documentation</td></tr></table>
+
+<h2>Call Java Methods</h2>
+
+<h3>Description</h3>
+
+<p>Call a Java method in the JVM running the Spark driver. The return
+values are automatically converted to R objects for simple objects. Other
+values are returned as "jobj" which are references to objects on JVM.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+sparkR.callJMethod(x, methodName, ...)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>x</code></td>
+<td>
+<p>object to invoke the method on. Should be a "jobj" created by newJObject.</p>
+</td></tr>
+<tr valign="top"><td><code>methodName</code></td>
+<td>
+<p>method name to call.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>parameters to pass to the Java method.</p>
+</td></tr>
+</table>
+
+
+<h3>Details</h3>
+
+<p>This is a low level function to access the JVM directly and should only be used
+for advanced use cases. The arguments and return values that are primitive R
+types (like integer, numeric, character, lists) are automatically translated to/from
+Java types (like Integer, Double, String, Array). A full list can be found in
+serialize.R and deserialize.R in the Apache Spark code base.
+</p>
+
+
+<h3>Value</h3>
+
+<p>the return value of the Java method. Either returned as an R object
+if it can be deserialized or returned as a "jobj". See details section for more.
+</p>
+
+
+<h3>Note</h3>
+
+<p>sparkR.callJMethod since 2.0.1
+</p>
+
+
+<h3>See Also</h3>
+
+<p><a href="sparkR.callJStatic.html">sparkR.callJStatic</a>, <a href="sparkR.newJObject.html">sparkR.newJObject</a>
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D sparkR.session() # Need to have a Spark JVM running before calling newJObject
+##D # Create a Java ArrayList and populate it
+##D jarray <- sparkR.newJObject("java.util.ArrayList")
+##D sparkR.callJMethod(jarray, "add", 42L)
+##D sparkR.callJMethod(jarray, "get", 0L) # Will print 42
+## End(Not run)
+</code></pre>
+
+
+<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/sparkR.callJStatic.html
----------------------------------------------------------------------
diff --git a/site/docs/2.2.0/api/R/sparkR.callJStatic.html b/site/docs/2.2.0/api/R/sparkR.callJStatic.html
new file mode 100644
index 0000000..f7d61b4
--- /dev/null
+++ b/site/docs/2.2.0/api/R/sparkR.callJStatic.html
@@ -0,0 +1,89 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head><title>R: Call Static Java Methods</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link rel="stylesheet" type="text/css" href="R.css">
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for sparkR.callJStatic {SparkR}"><tr><td>sparkR.callJStatic {SparkR}</td><td align="right">R Documentation</td></tr></table>
+
+<h2>Call Static Java Methods</h2>
+
+<h3>Description</h3>
+
+<p>Call a static method in the JVM running the Spark driver. The return
+value is automatically converted to R objects for simple objects. Other
+values are returned as "jobj" which are references to objects on JVM.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+sparkR.callJStatic(x, methodName, ...)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>x</code></td>
+<td>
+<p>fully qualified Java class name that contains the static method to invoke.</p>
+</td></tr>
+<tr valign="top"><td><code>methodName</code></td>
+<td>
+<p>name of static method to invoke.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>parameters to pass to the Java method.</p>
+</td></tr>
+</table>
+
+
+<h3>Details</h3>
+
+<p>This is a low level function to access the JVM directly and should only be used
+for advanced use cases. The arguments and return values that are primitive R
+types (like integer, numeric, character, lists) are automatically translated to/from
+Java types (like Integer, Double, String, Array). A full list can be found in
+serialize.R and deserialize.R in the Apache Spark code base.
+</p>
+
+
+<h3>Value</h3>
+
+<p>the return value of the Java method. Either returned as an R object
+if it can be deserialized or returned as a "jobj". See details section for more.
+</p> + + +<h3>Note</h3> + +<p>sparkR.callJStatic since 2.0.1 +</p> + + +<h3>See Also</h3> + +<p><a href="sparkR.callJMethod.html">sparkR.callJMethod</a>, <a href="sparkR.newJObject.html">sparkR.newJObject</a> +</p> + + +<h3>Examples</h3> + +<pre><code class="r">## Not run: +##D sparkR.session() # Need to have a Spark JVM running before calling callJStatic +##D sparkR.callJStatic("java.lang.System", "currentTimeMillis") +##D sparkR.callJStatic("java.lang.System", "getProperty", "java.home") +## End(Not run) +</code></pre> + + +<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div> +</body></html> http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/sparkR.conf.html ---------------------------------------------------------------------- diff --git a/site/docs/2.2.0/api/R/sparkR.conf.html b/site/docs/2.2.0/api/R/sparkR.conf.html new file mode 100644 index 0000000..5dc9862 --- /dev/null +++ b/site/docs/2.2.0/api/R/sparkR.conf.html @@ -0,0 +1,69 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html><head><title>R: Get Runtime Config from the current active SparkSession</title> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> +<link rel="stylesheet" type="text/css" href="R.css"> + +<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> +<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> +<script>hljs.initHighlightingOnLoad();</script> +</head><body> + +<table width="100%" summary="page for sparkR.conf {SparkR}"><tr><td>sparkR.conf {SparkR}</td><td align="right">R Documentation</td></tr></table> + +<h2>Get Runtime Config from the current active SparkSession</h2> + +<h3>Description</h3> + +<p>Get Runtime Config from the current active SparkSession. +To change SparkSession Runtime Config, please see <code>sparkR.session()</code>. 
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+sparkR.conf(key, defaultValue)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>key</code></td>
+<td>
+<p>(optional) The key of the config to get; if omitted, all config values are returned.</p>
+</td></tr>
+<tr valign="top"><td><code>defaultValue</code></td>
+<td>
+<p>(optional) The default value of the config to return if the config is not
+set; if omitted, the call fails if the config key is not set.</p>
+</td></tr>
+</table>
+
+
+<h3>Value</h3>
+
+<p>a list of config values with keys as their names
+</p>
+
+
+<h3>Note</h3>
+
+<p>sparkR.conf since 2.0.0
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
+##D sparkR.session()
+##D allConfigs <- sparkR.conf()
+##D masterValue <- unlist(sparkR.conf("spark.master"))
+##D namedConfig <- sparkR.conf("spark.executor.memory", "0g")
+## End(Not run)
+</code></pre>
+
+
+<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/sparkR.init-deprecated.html
----------------------------------------------------------------------
diff --git a/site/docs/2.2.0/api/R/sparkR.init-deprecated.html b/site/docs/2.2.0/api/R/sparkR.init-deprecated.html
new file mode 100644
index 0000000..1dc26e2
--- /dev/null
+++ b/site/docs/2.2.0/api/R/sparkR.init-deprecated.html
@@ -0,0 +1,93 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head><title>R: (Deprecated) Initialize a new Spark Context</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link rel="stylesheet" type="text/css" href="R.css">
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for sparkR.init {SparkR}"><tr><td>sparkR.init {SparkR}</td><td align="right">R Documentation</td></tr></table>
+
+<h2>(Deprecated) Initialize a new Spark Context</h2>
+
+<h3>Description</h3>
+
+<p>This function initializes a new SparkContext.
+
+
+<h3>Usage</h3>
+
+<pre>
+sparkR.init(master = "", appName = "SparkR",
+  sparkHome = Sys.getenv("SPARK_HOME"), sparkEnvir = list(),
+  sparkExecutorEnv = list(), sparkJars = "", sparkPackages = "")
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>master</code></td>
+<td>
+<p>The Spark master URL</p>
+</td></tr>
+<tr valign="top"><td><code>appName</code></td>
+<td>
+<p>Application name to register with cluster manager</p>
+</td></tr>
+<tr valign="top"><td><code>sparkHome</code></td>
+<td>
+<p>Spark Home directory</p>
+</td></tr>
+<tr valign="top"><td><code>sparkEnvir</code></td>
+<td>
+<p>Named list of environment variables to set on worker nodes</p>
+</td></tr>
+<tr valign="top"><td><code>sparkExecutorEnv</code></td>
+<td>
+<p>Named list of environment variables to be used when launching executors</p>
+</td></tr>
+<tr valign="top"><td><code>sparkJars</code></td>
+<td>
+<p>Character vector of jar files to pass to the worker nodes</p>
+</td></tr>
+<tr valign="top"><td><code>sparkPackages</code></td>
+<td>
+<p>Character vector of package coordinates</p>
+</td></tr>
+</table>
+
+
+<h3>Note</h3>
+
+<p>sparkR.init since 1.4.0
+</p>
+
+
+<h3>See Also</h3>
+
+<p><a href="sparkR.session.html">sparkR.session</a>
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
##D sc <- sparkR.init("local[2]", "SparkR", "/home/spark")
##D sc <- sparkR.init("local[2]", "SparkR", "/home/spark",
##D                  list(spark.executor.memory="1g"))
##D sc <- sparkR.init("yarn-client", "SparkR", "/home/spark",
##D                  list(spark.executor.memory="4g"),
##D                  list(LD_LIBRARY_PATH="/directory of JVM libraries (libjvm.so) on workers/"),
##D                  c("one.jar", "two.jar", "three.jar"),
##D                  c("com.databricks:spark-avro_2.10:2.0.1"))
## End(Not run)
</code></pre>
+
+
+<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/sparkR.newJObject.html
----------------------------------------------------------------------
diff --git a/site/docs/2.2.0/api/R/sparkR.newJObject.html b/site/docs/2.2.0/api/R/sparkR.newJObject.html
new file mode 100644
index 0000000..6ee337e
--- /dev/null
+++ b/site/docs/2.2.0/api/R/sparkR.newJObject.html
@@ -0,0 +1,87 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head><title>R: Create Java Objects</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link rel="stylesheet" type="text/css" href="R.css">
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for sparkR.newJObject {SparkR}"><tr><td>sparkR.newJObject {SparkR}</td><td align="right">R Documentation</td></tr></table>
+
+<h2>Create Java Objects</h2>
+
+<h3>Description</h3>
+
+<p>Create a new Java object in the JVM running the Spark driver. The return
+value is automatically converted to an R object for simple objects. Other
+values are returned as a "jobj", which is a reference to an object on the JVM.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+sparkR.newJObject(x, ...)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>x</code></td>
+<td>
+<p>fully qualified Java class name.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>arguments to be passed to the constructor.</p>
+</td></tr>
+</table>
+
+
+<h3>Details</h3>
+
+<p>This is a low-level function to access the JVM directly and should only be used
+for advanced use cases. The arguments and return values that are primitive R
+types (like integer, numeric, character, lists) are automatically translated to/from
+Java types (like Integer, Double, String, Array). A full list can be found in
+serialize.R and deserialize.R in the Apache Spark code base.
+</p>
+
+
+<h3>Value</h3>
+
+<p>the object created. Either returned as an R object
+if it can be deserialized or returned as a "jobj". See the details section for more.
+</p>
+
+
+<h3>Note</h3>
+
+<p>sparkR.newJObject since 2.0.1
+</p>
+
+
+<h3>See Also</h3>
+
+<p><a href="sparkR.callJMethod.html">sparkR.callJMethod</a>, <a href="sparkR.callJStatic.html">sparkR.callJStatic</a>
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
##D sparkR.session() # Need to have a Spark JVM running before calling newJObject
##D # Create a Java ArrayList and populate it
##D jarray <- sparkR.newJObject("java.util.ArrayList")
##D sparkR.callJMethod(jarray, "add", 42L)
##D sparkR.callJMethod(jarray, "get", 0L) # Will print 42
## End(Not run)
</code></pre>
+
+
+<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/sparkR.session.html
----------------------------------------------------------------------
diff --git a/site/docs/2.2.0/api/R/sparkR.session.html b/site/docs/2.2.0/api/R/sparkR.session.html
new file mode 100644
index 0000000..0b1076b
--- /dev/null
+++ b/site/docs/2.2.0/api/R/sparkR.session.html
@@ -0,0 +1,115 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head><title>R: Get the existing SparkSession or initialize a new...</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link rel="stylesheet" type="text/css" href="R.css">
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for sparkR.session {SparkR}"><tr><td>sparkR.session {SparkR}</td><td align="right">R Documentation</td></tr></table>
+
+<h2>Get the existing SparkSession or initialize a new SparkSession.</h2>
+
+<h3>Description</h3>
+
+<p>SparkSession is the entry point into SparkR. <code>sparkR.session</code> gets the existing
+SparkSession or initializes a new SparkSession.
+Additional Spark properties can be set in <code>...</code>, and these named parameters take
+priority over values set via <code>master</code>, <code>appName</code>, or the named list
+<code>sparkConfig</code>.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+sparkR.session(master = "", appName = "SparkR",
+  sparkHome = Sys.getenv("SPARK_HOME"), sparkConfig = list(),
+  sparkJars = "", sparkPackages = "", enableHiveSupport = TRUE, ...)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>master</code></td>
+<td>
+<p>the Spark master URL.</p>
+</td></tr>
+<tr valign="top"><td><code>appName</code></td>
+<td>
+<p>application name to register with cluster manager.</p>
+</td></tr>
+<tr valign="top"><td><code>sparkHome</code></td>
+<td>
+<p>Spark Home directory.</p>
+</td></tr>
+<tr valign="top"><td><code>sparkConfig</code></td>
+<td>
+<p>named list of Spark configuration to set on worker nodes.</p>
+</td></tr>
+<tr valign="top"><td><code>sparkJars</code></td>
+<td>
+<p>character vector of jar files to pass to the worker nodes.</p>
+</td></tr>
+<tr valign="top"><td><code>sparkPackages</code></td>
+<td>
+<p>character vector of package coordinates</p>
+</td></tr>
+<tr valign="top"><td><code>enableHiveSupport</code></td>
+<td>
+<p>enable support for Hive; falls back if Spark is not built with Hive support. Once
+set, this cannot be turned off on an existing session.</p>
+</td></tr>
+<tr valign="top"><td><code>...</code></td>
+<td>
+<p>named Spark properties passed to the method.</p>
+</td></tr>
+</table>
+
+
+<h3>Details</h3>
+
+<p>When called in an interactive session, this method checks for the Spark installation, and, if not
+found, downloads and caches it automatically. Alternatively, <code>install.spark</code> can
+be called manually.
+</p>
+<p>A default warehouse is created automatically in the current directory when a managed table is
+created via a <code>sql</code> statement such as <code>CREATE TABLE</code>. To change the location of the
+warehouse, set the named parameter <code>spark.sql.warehouse.dir</code> when creating the SparkSession.
+Along with the warehouse, an accompanying metastore may also be automatically created in the
+current directory when a new SparkSession is initialized with <code>enableHiveSupport</code> set to
+<code>TRUE</code>, which is the default. For more details, refer to the Hive configuration at
+<a href="http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables">http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables</a>.
+</p>
+<p>For details on how to initialize and use SparkR, refer to the SparkR programming guide at
+<a href="http://spark.apache.org/docs/latest/sparkr.html#starting-up-sparksession">http://spark.apache.org/docs/latest/sparkr.html#starting-up-sparksession</a>.
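+</p>
+
+<p>For instance, the warehouse location described above can be supplied as a named Spark
+property (a minimal sketch; the path is illustrative):
+</p>
+
+<pre><code class="r">## Not run: 
##D # Named properties in ... take priority over sparkConfig values
##D sparkR.session(spark.sql.warehouse.dir = "/tmp/spark-warehouse")
## End(Not run)
</code></pre>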
+
+
+<h3>Note</h3>
+
+<p>sparkR.session since 2.0.0
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
##D sparkR.session()
##D df <- read.json(path)
##D 
##D sparkR.session("local[2]", "SparkR", "/home/spark")
##D sparkR.session("yarn-client", "SparkR", "/home/spark",
##D                list(spark.executor.memory="4g"),
##D                c("one.jar", "two.jar", "three.jar"),
##D                c("com.databricks:spark-avro_2.10:2.0.1"))
##D sparkR.session(spark.master = "yarn-client", spark.executor.memory = "4g")
## End(Not run)
</code></pre>
+
+
+<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/sparkR.session.stop.html
----------------------------------------------------------------------
diff --git a/site/docs/2.2.0/api/R/sparkR.session.stop.html b/site/docs/2.2.0/api/R/sparkR.session.stop.html
new file mode 100644
index 0000000..fc15cb6
--- /dev/null
+++ b/site/docs/2.2.0/api/R/sparkR.session.stop.html
@@ -0,0 +1,40 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head><title>R: Stop the Spark Session and Spark Context</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link rel="stylesheet" type="text/css" href="R.css">
+</head><body>
+
+<table width="100%" summary="page for sparkR.session.stop {SparkR}"><tr><td>sparkR.session.stop {SparkR}</td><td align="right">R Documentation</td></tr></table>
+
+<h2>Stop the Spark Session and Spark Context</h2>
+
+<h3>Description</h3>
+
+<p>Stop the Spark Session and Spark Context.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+sparkR.session.stop()
+
+sparkR.stop()
+</pre>
+
+
+<h3>Details</h3>
+
+<p>Also terminates the backend this R session is connected to.
+</p>
+
+
+<h3>Note</h3>
+
+<p>sparkR.session.stop since 2.0.0
+</p>
+<p>sparkR.stop since 1.4.0
+</p>
+
+<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/sparkR.uiWebUrl.html
----------------------------------------------------------------------
diff --git a/site/docs/2.2.0/api/R/sparkR.uiWebUrl.html b/site/docs/2.2.0/api/R/sparkR.uiWebUrl.html
new file mode 100644
index 0000000..ad8b700
--- /dev/null
+++ b/site/docs/2.2.0/api/R/sparkR.uiWebUrl.html
@@ -0,0 +1,51 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head><title>R: Get the URL of the SparkUI instance for the current active...</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link rel="stylesheet" type="text/css" href="R.css">
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for sparkR.uiWebUrl {SparkR}"><tr><td>sparkR.uiWebUrl {SparkR}</td><td align="right">R Documentation</td></tr></table>
+
+<h2>Get the URL of the SparkUI instance for the current active SparkSession</h2>
+
+<h3>Description</h3>
+
+<p>Get the URL of the SparkUI instance for the current active SparkSession.
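+</p>
+
+<p>The returned URL can be opened directly from R; a minimal sketch (assumes the UI is
+enabled and a session has started):
+</p>
+
+<pre><code class="r">## Not run: 
##D sparkR.session()
##D url <- sparkR.uiWebUrl()
##D if (!is.na(url)) browseURL(url) # open the Spark UI in the default browser
## End(Not run)
</code></pre>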
+
+
+<h3>Usage</h3>
+
+<pre>
+sparkR.uiWebUrl()
+</pre>
+
+
+<h3>Value</h3>
+
+<p>the SparkUI URL, or NA if it is disabled or not started.
+</p>
+
+
+<h3>Note</h3>
+
+<p>sparkR.uiWebUrl since 2.1.1
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
##D sparkR.session()
##D url <- sparkR.uiWebUrl()
## End(Not run)
</code></pre>
+
+
+<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/sparkR.version.html
----------------------------------------------------------------------
diff --git a/site/docs/2.2.0/api/R/sparkR.version.html b/site/docs/2.2.0/api/R/sparkR.version.html
new file mode 100644
index 0000000..925ea42
--- /dev/null
+++ b/site/docs/2.2.0/api/R/sparkR.version.html
@@ -0,0 +1,51 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head><title>R: Get version of Spark on which this application is running</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link rel="stylesheet" type="text/css" href="R.css">
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for sparkR.version {SparkR}"><tr><td>sparkR.version {SparkR}</td><td align="right">R Documentation</td></tr></table>
+
+<h2>Get version of Spark on which this application is running</h2>
+
+<h3>Description</h3>
+
+<p>Get the version of Spark on which this application is running.
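+</p>
+
+<p>For example, the version string can be used to gate version-dependent code (a minimal
+sketch; <code>startsWith</code> requires R 3.3 or later):
+</p>
+
+<pre><code class="r">## Not run: 
##D sparkR.session()
##D if (startsWith(sparkR.version(), "2.")) {
##D   # take a code path that relies on Spark 2.x behavior
##D }
## End(Not run)
</code></pre>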
+
+
+<h3>Usage</h3>
+
+<pre>
+sparkR.version()
+</pre>
+
+
+<h3>Value</h3>
+
+<p>a character string of the Spark version
+</p>
+
+
+<h3>Note</h3>
+
+<p>sparkR.version since 2.0.1
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
##D sparkR.session()
##D version <- sparkR.version()
## End(Not run)
</code></pre>
+
+
+<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/sparkRHive.init-deprecated.html
----------------------------------------------------------------------
diff --git a/site/docs/2.2.0/api/R/sparkRHive.init-deprecated.html b/site/docs/2.2.0/api/R/sparkRHive.init-deprecated.html
new file mode 100644
index 0000000..3ee250d
--- /dev/null
+++ b/site/docs/2.2.0/api/R/sparkRHive.init-deprecated.html
@@ -0,0 +1,68 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head><title>R: (Deprecated) Initialize a new HiveContext</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link rel="stylesheet" type="text/css" href="R.css">
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for sparkRHive.init {SparkR}"><tr><td>sparkRHive.init {SparkR}</td><td align="right">R Documentation</td></tr></table>
+
+<h2>(Deprecated) Initialize a new HiveContext</h2>
+
+<h3>Description</h3>
+
+<p>This function creates a HiveContext from an existing JavaSparkContext.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+sparkRHive.init(jsc = NULL)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>jsc</code></td>
+<td>
+<p>The existing JavaSparkContext created with sparkR.init()</p>
+</td></tr>
+</table>
+
+
+<h3>Details</h3>
+
+<p>Starting with SparkR 2.0, a SparkSession is initialized and returned instead.
+This API is deprecated and kept for backward compatibility only.
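+</p>
+
+<p>A rough modern equivalent (a sketch; Hive support is enabled on the session itself):
+</p>
+
+<pre><code class="r">## Not run: 
##D # Instead of: sqlContext <- sparkRHive.init(sc)
##D sparkR.session(enableHiveSupport = TRUE)
## End(Not run)
</code></pre>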
+
+
+<h3>Note</h3>
+
+<p>sparkRHive.init since 1.4.0
+</p>
+
+
+<h3>See Also</h3>
+
+<p><a href="sparkR.session.html">sparkR.session</a>
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
##D sc <- sparkR.init()
##D sqlContext <- sparkRHive.init(sc)
## End(Not run)
</code></pre>
+
+
+<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/sparkRSQL.init-deprecated.html
----------------------------------------------------------------------
diff --git a/site/docs/2.2.0/api/R/sparkRSQL.init-deprecated.html b/site/docs/2.2.0/api/R/sparkRSQL.init-deprecated.html
new file mode 100644
index 0000000..0653fb9
--- /dev/null
+++ b/site/docs/2.2.0/api/R/sparkRSQL.init-deprecated.html
@@ -0,0 +1,69 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head><title>R: (Deprecated) Initialize a new SQLContext</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link rel="stylesheet" type="text/css" href="R.css">
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for sparkRSQL.init {SparkR}"><tr><td>sparkRSQL.init {SparkR}</td><td align="right">R Documentation</td></tr></table>
+
+<h2>(Deprecated) Initialize a new SQLContext</h2>
+
+<h3>Description</h3>
+
+<p>This function creates a SparkContext from an existing JavaSparkContext and
+then uses it to initialize a new SQLContext.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+sparkRSQL.init(jsc = NULL)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>jsc</code></td>
+<td>
+<p>The existing JavaSparkContext created with sparkR.init()</p>
+</td></tr>
+</table>
+
+
+<h3>Details</h3>
+
+<p>Starting with SparkR 2.0, a SparkSession is initialized and returned instead.
+This API is deprecated and kept for backward compatibility only.
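+</p>
+
+<p>A rough modern equivalent (a sketch; the SparkSession subsumes the SQLContext):
+</p>
+
+<pre><code class="r">## Not run: 
##D # Instead of: sqlContext <- sparkRSQL.init(sc)
##D sparkR.session()
## End(Not run)
</code></pre>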
+
+
+<h3>Note</h3>
+
+<p>sparkRSQL.init since 1.4.0
+</p>
+
+
+<h3>See Also</h3>
+
+<p><a href="sparkR.session.html">sparkR.session</a>
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
##D sc <- sparkR.init()
##D sqlContext <- sparkRSQL.init(sc)
## End(Not run)
</code></pre>
+
+
+<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/spark_partition_id.html
----------------------------------------------------------------------
diff --git a/site/docs/2.2.0/api/R/spark_partition_id.html b/site/docs/2.2.0/api/R/spark_partition_id.html
new file mode 100644
index 0000000..d17e154
--- /dev/null
+++ b/site/docs/2.2.0/api/R/spark_partition_id.html
@@ -0,0 +1,63 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head><title>R: Return the partition ID as a column</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link rel="stylesheet" type="text/css" href="R.css">
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for spark_partition_id {SparkR}"><tr><td>spark_partition_id {SparkR}</td><td align="right">R Documentation</td></tr></table>
+
+<h2>Return the partition ID as a column</h2>
+
+<h3>Description</h3>
+
+<p>Return the partition ID as a SparkDataFrame column.
+Note that this is nondeterministic because it depends on data partitioning and
+task scheduling.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+## S4 method for signature 'missing'
+spark_partition_id()
+
+spark_partition_id(x = "missing")
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>x</code></td>
+<td>
+<p>empty. Should be used with no argument.</p>
+</td></tr>
+</table>
+
+
+<h3>Details</h3>
+
+<p>This is equivalent to the SPARK_PARTITION_ID function in SQL.
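+</p>
+
+<p>For example, the distribution of rows across partitions can be inspected (a minimal
+sketch, assuming a SparkDataFrame <code>df</code>):
+</p>
+
+<pre><code class="r">## Not run: 
##D # Count rows per partition
##D pids <- select(df, alias(spark_partition_id(), "pid"))
##D counts <- count(groupBy(pids, "pid"))
##D head(counts)
## End(Not run)
</code></pre>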
+
+
+<h3>Note</h3>
+
+<p>spark_partition_id since 2.0.0
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: select(df, spark_partition_id())
</code></pre>
+
+
+<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div>
+</body></html>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/f7ec1155/site/docs/2.2.0/api/R/sql.html
----------------------------------------------------------------------
diff --git a/site/docs/2.2.0/api/R/sql.html b/site/docs/2.2.0/api/R/sql.html
new file mode 100644
index 0000000..fb81cb7
--- /dev/null
+++ b/site/docs/2.2.0/api/R/sql.html
@@ -0,0 +1,65 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head><title>R: SQL Query</title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link rel="stylesheet" type="text/css" href="R.css">
+
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
+<script>hljs.initHighlightingOnLoad();</script>
+</head><body>
+
+<table width="100%" summary="page for sql {SparkR}"><tr><td>sql {SparkR}</td><td align="right">R Documentation</td></tr></table>
+
+<h2>SQL Query</h2>
+
+<h3>Description</h3>
+
+<p>Executes a SQL query using Spark, returning the result as a SparkDataFrame.
+</p>
+
+
+<h3>Usage</h3>
+
+<pre>
+## Default S3 method:
+sql(sqlQuery)
+</pre>
+
+
+<h3>Arguments</h3>
+
+<table summary="R argblock">
+<tr valign="top"><td><code>sqlQuery</code></td>
+<td>
+<p>A character vector containing the SQL query</p>
+</td></tr>
+</table>
+
+
+<h3>Value</h3>
+
+<p>SparkDataFrame
+</p>
+
+
+<h3>Note</h3>
+
+<p>sql since 1.4.0
+</p>
+
+
+<h3>Examples</h3>
+
+<pre><code class="r">## Not run: 
##D sparkR.session()
##D path <- "path/to/file.json"
##D df <- read.json(path)
##D createOrReplaceTempView(df, "table")
##D new_df <- sql("SELECT * FROM table")
## End(Not run)
</code></pre>
+
+
+<hr><div align="center">[Package <em>SparkR</em> version 2.2.0 <a href="00Index.html">Index</a>]</div>
+</body></html>