[
https://issues.apache.org/jira/browse/SPARK-11219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15047262#comment-15047262
]
Joseph K. Bradley commented on SPARK-11219:
-------------------------------------------
Thanks for the careful assessment! Creating subtasks sounds good.
> Make Parameter Description Format Consistent in PySpark.MLlib
> -------------------------------------------------------------
>
> Key: SPARK-11219
> URL: https://issues.apache.org/jira/browse/SPARK-11219
> Project: Spark
> Issue Type: Documentation
> Components: Documentation, MLlib, PySpark
> Reporter: Bryan Cutler
> Priority: Trivial
>
> There are several different formats for describing params in PySpark.MLlib,
> making it unclear which one is preferred, e.g. vertical alignment vs.
> single line.
> This issue is to agree on a format and make it consistent across PySpark.MLlib.
> Following the discussion in SPARK-10560, using 2 lines with an indentation is
> both readable and avoids changing many lines when adding/removing
> parameters. If the parameter has a default value, put it in parentheses
> on a new line under the description.
> Example:
> {noformat}
> :param stepSize:
>   Step size for each iteration of gradient descent.
>   (default: 0.1)
> :param numIterations:
>   Number of iterations run for each batch of data.
>   (default: 50)
> {noformat}
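For illustration, here is a minimal sketch of how the proposed format might look inside an actual Python docstring. The function `train` and its `data` parameter are hypothetical and are not taken from PySpark.MLlib; the two-space indent is also an assumption, since the description only says "an indentation". Only the `stepSize` and `numIterations` entries come from the example above.

```python
def train(data, stepSize=0.1, numIterations=50):
    """
    Hypothetical training function, used only to illustrate the format.

    :param data:
      Training data. (Hypothetical parameter with no default value,
      so it has no default line.)
    :param stepSize:
      Step size for each iteration of gradient descent.
      (default: 0.1)
    :param numIterations:
      Number of iterations run for each batch of data.
      (default: 50)
    """
    # Placeholder body: return the settings so the sketch is runnable.
    return {"stepSize": stepSize, "numIterations": numIterations}
```

With this layout, adding or removing a parameter touches only that parameter's own lines, which is the benefit the description cites over vertical alignment.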
> h2. Current State of Parameter Description Formatting
> h4. Classification
> * LogisticRegressionModel - single line descriptions, fix indentations
> * LogisticRegressionWithSGD - vertical alignment, sporadic default values
> * LogisticRegressionWithLBFGS - vertical alignment, sporadic default values
> * SVMModel - single line
> * SVMWithSGD - vertical alignment, sporadic default values
> * NaiveBayesModel - single line
> * NaiveBayes - single line
> h4. Clustering
> * KMeansModel - missing param description
> * KMeans - missing param description and defaults
> * GaussianMixture - vertical align, incorrect default formatting
> * PowerIterationClustering - single line with wrapped indentation, missing
> defaults
> * StreamingKMeansModel - single line wrapped
> * StreamingKMeans - single line wrapped, missing defaults
> * LDAModel - single line
> * LDA - vertical align, missing some defaults
> h4. FPM
> * FPGrowth - single line
> * PrefixSpan - single line, default values in backticks
> h4. Recommendation
> * ALS - does not have param descriptions
> h4. Regression
> * LabeledPoint - single line
> * LinearModel - single line
> * LinearRegressionWithSGD - vertical alignment
> * RidgeRegressionWithSGD - vertical align
> * IsotonicRegressionModel - single line
> * IsotonicRegression - single line, missing default
> h4. Tree
> * DecisionTree - single line with vertical indentation, missing defaults
> * RandomForest - single line with wrapped indent, missing some defaults
> * GradientBoostedTrees - single line with wrapped indent
> NOTE
> This issue will focus only on model/algorithm descriptions, which are the
> largest source of inconsistent formatting.
> evaluation.py, feature.py, random.py, utils.py - these supporting classes
> have single-line param descriptions, but they are consistent and so don't
> need to be changed.