[ https://issues.apache.org/jira/browse/SPARK-11219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Cutler updated SPARK-11219:
---------------------------------
    Description:
There are several different formats for describing params in PySpark.MLlib, making it unclear what the preferred way to document is, i.e. vertical alignment vs single line.

This is to agree on a format and make it consistent across PySpark.MLlib.

Following the discussion in SPARK-10560, using 2 lines with an indentation is both readable and doesn't lead to changing many lines when adding/removing parameters. If the parameter uses a default value, put this in parentheses on a new line under the description.

Example:
{noformat}
:param stepSize:
  Step size for each iteration of gradient descent.
  (default: 0.1)
:param numIterations:
  Number of iterations run for each batch of data.
  (default: 50)
{noformat}

h2. Current State of Parameter Description Formatting

h4. Classification
  * LogisticRegressionModel - single line descriptions, fix indentations
  * LogisticRegressionWithSGD - vertical alignment, sporadic default values
  * LogisticRegressionWithLBFGS - vertical alignment, sporadic default values
  * SVMModel - single line
  * SVMWithSGD - vertical alignment, sporadic default values
  * NaiveBayesModel - single line
  * NaiveBayes - single line

h4. Clustering
  * KMeansModel - missing param description
  * KMeans - missing param description and defaults
  * GaussianMixture - vertical align, incorrect default formatting
  * PowerIterationClustering - single line with wrapped indentation, missing defaults
  * StreamingKMeansModel - single line wrapped
  * StreamingKMeans - single line wrapped, missing defaults
  * LDAModel - single line
  * LDA - vertical align, missing some defaults

h4. FPM
  * FPGrowth - single line
  * PrefixSpan - single line, default values in backticks

h4. Recommendation
  * ALS - does not have param descriptions

h4. Regression
  * LabeledPoint - single line
  * LinearModel - single line
  * LinearRegressionWithSGD - vertical alignment
  * RidgeRegressionWithSGD - vertical align
  * IsotonicRegressionModel - single line
  * IsotonicRegression - single line, missing default

h4. Tree
  * DecisionTree - single line with vertical indentation, missing defaults
  * RandomForest - single line with wrapped indent, missing some defaults
  * GradientBoostedTrees - single line with wrapped indent

NOTE
This issue will just focus on model/algorithm descriptions, which are the largest source of inconsistent formatting.
evaluation.py, feature.py, random.py, utils.py - these supporting classes have param descriptions as single line, but are consistent so don't need to be changed.

  was:
There are several different formats for describing params in PySpark.MLlib, making it unclear what the preferred way to document is, i.e. vertical alignment vs single line.

This is to agree on a format and make it consistent across PySpark.MLlib.

Following the discussion in SPARK-10560, using 2 lines with an indentation is both readable and doesn't lead to changing many lines when adding/removing parameters. If the parameter uses a default value, put this in parentheses on a new line under the description.

Example:
{noformat}
:param stepSize:
  Step size for each iteration of gradient descent.
  (default: 0.1)
:param numIterations:
  Number of iterations run for each batch of data.
  (default: 50)
{noformat}


> Make Parameter Description Format Consistent in PySpark.MLlib
> -------------------------------------------------------------
>
>                 Key: SPARK-11219
>                 URL: https://issues.apache.org/jira/browse/SPARK-11219
>             Project: Spark
>          Issue Type: Documentation
>          Components: Documentation, MLlib, PySpark
>            Reporter: Bryan Cutler
>            Priority: Trivial
>
> There are several different formats for describing params in PySpark.MLlib,
> making it unclear what the preferred way to document is, i.e. vertical
> alignment vs single line.
> This is to agree on a format and make it consistent across PySpark.MLlib.
> Following the discussion in SPARK-10560, using 2 lines with an indentation is
> both readable and doesn't lead to changing many lines when adding/removing
> parameters. If the parameter uses a default value, put this in parentheses
> on a new line under the description.
> Example:
> {noformat}
> :param stepSize:
>   Step size for each iteration of gradient descent.
>   (default: 0.1)
> :param numIterations:
>   Number of iterations run for each batch of data.
>   (default: 50)
> {noformat}
> h2. Current State of Parameter Description Formatting
> h4. Classification
>   * LogisticRegressionModel - single line descriptions, fix indentations
>   * LogisticRegressionWithSGD - vertical alignment, sporadic default values
>   * LogisticRegressionWithLBFGS - vertical alignment, sporadic default values
>   * SVMModel - single line
>   * SVMWithSGD - vertical alignment, sporadic default values
>   * NaiveBayesModel - single line
>   * NaiveBayes - single line
> h4. Clustering
>   * KMeansModel - missing param description
>   * KMeans - missing param description and defaults
>   * GaussianMixture - vertical align, incorrect default formatting
>   * PowerIterationClustering - single line with wrapped indentation, missing defaults
>   * StreamingKMeansModel - single line wrapped
>   * StreamingKMeans - single line wrapped, missing defaults
>   * LDAModel - single line
>   * LDA - vertical align, missing some defaults
> h4. FPM
>   * FPGrowth - single line
>   * PrefixSpan - single line, default values in backticks
> h4. Recommendation
>   * ALS - does not have param descriptions
> h4. Regression
>   * LabeledPoint - single line
>   * LinearModel - single line
>   * LinearRegressionWithSGD - vertical alignment
>   * RidgeRegressionWithSGD - vertical align
>   * IsotonicRegressionModel - single line
>   * IsotonicRegression - single line, missing default
> h4. Tree
>   * DecisionTree - single line with vertical indentation, missing defaults
>   * RandomForest - single line with wrapped indent, missing some defaults
>   * GradientBoostedTrees - single line with wrapped indent
> NOTE
> This issue will just focus on model/algorithm descriptions, which are the
> largest source of inconsistent formatting.
> evaluation.py, feature.py, random.py, utils.py - these supporting classes
> have param descriptions as single line, but are consistent so don't need to
> be changed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
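
For illustration only, here is a minimal sketch of how the proposed two-line format might look in an actual Python docstring. The function `train` and the helper `documented_defaults` are hypothetical and are not part of PySpark; the helper simply reads back the "(default: ...)" lines so they can be compared with the real signature.

```python
import inspect
import re


def train(data, stepSize=0.1, numIterations=50):
    """
    Hypothetical training function, used only to show the docstring layout.

    :param data:
      The training data.
    :param stepSize:
      Step size for each iteration of gradient descent.
      (default: 0.1)
    :param numIterations:
      Number of iterations run for each batch of data.
      (default: 50)
    """
    return {"stepSize": stepSize, "numIterations": numIterations, "n": len(data)}


def documented_defaults(func):
    """Map each documented parameter name to the default text in its docstring.

    Parameters that have no "(default: ...)" line map to None.
    """
    doc = inspect.getdoc(func) or ""  # getdoc strips the common indentation
    defaults = {}
    current = None
    for line in doc.splitlines():
        header = re.match(r":param (\w+):\s*$", line)
        if header:
            current = header.group(1)
            defaults[current] = None
            continue
        default = re.match(r"\s+\(default: (.+)\)\s*$", line)
        if current is not None and default:
            defaults[current] = default.group(1)
    return defaults


if __name__ == "__main__":
    # Compare the documented defaults with the ones in the signature.
    signature = inspect.signature(train)
    for name, text in documented_defaults(train).items():
        actual = signature.parameters[name].default
        print(name, "documented:", text, "actual:", actual)
```

A check along these lines could flag docstrings whose documented defaults drift from the signature, though whether such a check is worth adding is outside the scope of this issue.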