Repository: spark
Updated Branches:
refs/heads/master 56a52e0a5 -> 1c9c5de95
[SPARK-23291][R][FOLLOWUP] Update SparkR migration note for
## What changes were proposed in this pull request?
This PR fixes the migration note for SPARK-23291, since the change is going to
be backported to 2.3.1.
Repository: spark
Updated Branches:
refs/heads/branch-2.3 f87785a76 -> 3a22feab4
[SPARK-23291][SQL][R][BRANCH-2.3] R's substr should not reduce starting
position by 1 when calling Scala API
## What changes were proposed in this pull request?
This PR backports
Repository: spark
Updated Branches:
refs/heads/master 9c289a5cb -> d3ae3e1e8
[SPARK-19634][SQL][ML][FOLLOW-UP] Improve interface of dataframe vectorized
summarizer
## What changes were proposed in this pull request?
Make several improvements in dataframe vectorized summarizer.
1. Make the
Repository: spark
Updated Branches:
refs/heads/master 0114c89d0 -> fb0562f34
[SPARK-22810][ML][PYSPARK] Expose Python API for LinearRegression with huber
loss.
## What changes were proposed in this pull request?
Expose Python API for _LinearRegression_ with _huber_ loss.
## How was this
Repository: spark
Updated Branches:
refs/heads/master 2a29a60da -> 1e44dd004
[SPARK-3181][ML] Implement huber loss for LinearRegression.
## What changes were proposed in this pull request?
MLlib ```LinearRegression``` supports _huber_ loss in addition to _leastSquares_
loss. The huber loss
Repository: spark
Updated Branches:
refs/heads/master 17cdabb88 -> b03af8b58
[SPARK-21087][ML][FOLLOWUP] Sync SharedParamsCodeGen and sharedParams.
## What changes were proposed in this pull request?
#19208 modified ```sharedParams.scala```, but didn't generated by
Repository: spark
Updated Branches:
refs/heads/branch-2.2 9e2d96d1d -> 00cdb38dc
[SPARK-22289][ML] Add JSON support for Matrix parameters (LR with coefficients
bound)
## What changes were proposed in this pull request?
jira: https://issues.apache.org/jira/browse/SPARK-22289
add JSON
Repository: spark
Updated Branches:
refs/heads/master e6dc5f280 -> 10c27a655
[SPARK-22289][ML] Add JSON support for Matrix parameters (LR with coefficients
bound)
## What changes were proposed in this pull request?
jira: https://issues.apache.org/jira/browse/SPARK-22289
add JSON
Repository: spark
Updated Branches:
refs/heads/master 7475a9655 -> 3da3d7635
[SPARK-14516][ML][FOLLOW-UP] Move ClusteringEvaluatorSuite test data to
data/mllib.
## What changes were proposed in this pull request?
Move ```ClusteringEvaluatorSuite``` test data(iris) to data/mllib, to prevent
Repository: spark
Updated Branches:
refs/heads/master fedf6961b -> 5ac96854c
[SPARK-21981][PYTHON][ML] Added Python interface for ClusteringEvaluator
## What changes were proposed in this pull request?
Added Python interface for ClusteringEvaluator
## How was this patch tested?
Manual
Repository: spark
Updated Branches:
refs/heads/master 8319432af -> 2f962422a
[MINOR][ML] Remove unnecessary default value setting for evaluators.
## What changes were proposed in this pull request?
Remove unnecessary default value setting for all evaluators, as we have set
them in
Repository: spark
Updated Branches:
refs/heads/branch-2.2 3a692e355 -> 51e5a821d
[SPARK-18608][ML][FOLLOWUP] Fix double caching for PySpark OneVsRest.
## What changes were proposed in this pull request?
#19197 fixed double caching for MLlib algorithms, but missed PySpark
```OneVsRest```,
Repository: spark
Updated Branches:
refs/heads/master 66cb72d7b -> c76153cc7
[SPARK-18608][ML][FOLLOWUP] Fix double caching for PySpark OneVsRest.
## What changes were proposed in this pull request?
#19197 fixed double caching for MLlib algorithms, but missed PySpark
```OneVsRest```, this PR
Repository: spark
Updated Branches:
refs/heads/master 8d8641f12 -> 66cb72d7b
[MINOR][DOC] Add missing call of `update()` in examples of
PeriodicGraphCheckpointer & PeriodicRDDCheckpointer
## What changes were proposed in this pull request?
forgot to call `update()` with `graph1` & `rdd1` in
Repository: spark
Updated Branches:
refs/heads/master dcbb22943 -> 8d8641f12
[SPARK-21854] Added LogisticRegressionTrainingSummary for
MultinomialLogisticRegression in Python API
## What changes were proposed in this pull request?
Added LogisticRegressionTrainingSummary for
Repository: spark
Updated Branches:
refs/heads/master ca00cc70d -> 0fa5b7cac
[SPARK-21690][ML] one-pass imputer
## What changes were proposed in this pull request?
parallelize the computation of all columns
performance tests:
|numColumns| Mean(Old) | Median(Old) | Mean(RDD) | Median(RDD) |
Repository: spark
Updated Branches:
refs/heads/master e2ac2f1c7 -> dd7816758
[SPARK-14516][ML] Adding ClusteringEvaluator with the implementation of Cosine
silhouette and squared Euclidean silhouette.
## What changes were proposed in this pull request?
This PR adds the ClusteringEvaluator
Repository: spark
Updated Branches:
refs/heads/master 828fab035 -> 4bab8f599
[SPARK-21856] Add probability and rawPrediction to MLPC for Python
Probability and rawPrediction have been added to MultilayerPerceptronClassifier
for Python
Add unit test.
Author: Chunsheng Ji
Repository: spark
Updated Branches:
refs/heads/master 05af2de0f -> f3676d639
[SPARK-21108][ML] convert LinearSVC to aggregator framework
## What changes were proposed in this pull request?
convert LinearSVC to new aggregator framework
## How was this patch tested?
existing unit test.
Repository: spark
Updated Branches:
refs/heads/master 3c0c2d09c -> 342961905
[ML][MINOR] Make sharedParams update.
## What changes were proposed in this pull request?
```sharedParams.scala``` was generated by ```SharedParamsCodeGen```, but it's
not updated in master. Maybe someone manual
Repository: spark
Updated Branches:
refs/heads/master 84b5b16ea -> c108a5d30
[SPARK-19762][ML][FOLLOWUP] Add necessary comments to L2Regularization.
## What changes were proposed in this pull request?
MLlib ```LinearRegression/LogisticRegression/LinearSVC``` always standardize
the data
Repository: spark
Updated Branches:
refs/heads/master 966083105 -> 07549b20a
[SPARK-19634][ML] Multivariate summarizer - dataframes API
## What changes were proposed in this pull request?
This patch adds the DataFrames API to the multivariate summarizer (mean,
variance, etc.). In addition
Repository: spark
Updated Branches:
refs/heads/branch-2.2 d02331452 -> 7446be332
[SPARK-21523][ML] update breeze to 0.13.2 for an emergency bugfix in strong
wolfe line search
## What changes were proposed in this pull request?
Update breeze to 0.13.2 for an emergency bugfix in strong wolfe
Repository: spark
Updated Branches:
refs/heads/master ae8a2b149 -> b35660dd0
[SPARK-21523][ML] update breeze to 0.13.2 for an emergency bugfix in strong
wolfe line search
## What changes were proposed in this pull request?
Update breeze to 0.13.2 for an emergency bugfix in strong wolfe line
Repository: spark
Updated Branches:
refs/heads/branch-2.0 c27a01aec -> 9f670ce5d
[SPARK-21306][ML] For branch 2.0, OneVsRest should support setWeightCol
The PR is related to #18554, and is modified for branch 2.0.
## What changes were proposed in this pull request?
add `setWeightCol` method
Repository: spark
Updated Branches:
refs/heads/branch-2.1 444cca14d -> 9b749b6ce
[SPARK-21306][ML] For branch 2.1, OneVsRest should support setWeightCol
The PR is related to #18554, and is modified for branch 2.1.
## What changes were proposed in this pull request?
add `setWeightCol` method
Repository: spark
Updated Branches:
refs/heads/master fdcee028a -> f763d8464
[SPARK-19270][FOLLOW-UP][ML] PySpark GLR model.summary should return a
printable representation.
## What changes were proposed in this pull request?
PySpark GLR ```model.summary``` should return a printable
Repository: spark
Updated Branches:
refs/heads/master 14e75758a -> 845c039ce
[SPARK-20601][ML] Python API for Constrained Logistic Regression
## What changes were proposed in this pull request?
Python API for Constrained Logistic Regression based on #17922 , thanks for the
original
Repository: spark
Updated Branches:
refs/heads/master 5fd0294ff -> 253a07e43
[SPARK-21388][ML][PYSPARK] GBTs inherit from HasStepSize & LinearSVC from
HasThreshold
## What changes were proposed in this pull request?
GBTs inherit from HasStepSize & LinearSVC/Binarizer from HasThreshold
##
Repository: spark
Updated Branches:
refs/heads/master 44e501ace -> 106eaa9b9
[SPARK-21575][SPARKR] Eliminate needless synchronization in java-R serialization
## What changes were proposed in this pull request?
Remove surplus synchronized blocks.
## How was this patch tested?
Unit tests run
Repository: spark
Updated Branches:
refs/heads/branch-2.1 8520d7c6d -> 258ca40cf
Revert "[SPARK-21306][ML] OneVsRest should support setWeightCol"
This reverts commit 8520d7c6d5e880dea3c1a8a874148c07222b4b4b.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit:
Repository: spark
Updated Branches:
refs/heads/branch-2.0 ccb827224 -> f8ae2bdd2
Revert "[SPARK-21306][ML] OneVsRest should support setWeightCol"
This reverts commit ccb82722450c20c9cdea2b2c68783943213a5aa1.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit:
Repository: spark
Updated Branches:
refs/heads/branch-2.0 d7b9d6235 -> ccb827224
[SPARK-21306][ML] OneVsRest should support setWeightCol
## What changes were proposed in this pull request?
add `setWeightCol` method for OneVsRest.
`weightCol` is ignored if classifier doesn't inherit
Repository: spark
Updated Branches:
refs/heads/branch-2.1 94987987a -> 8520d7c6d
[SPARK-21306][ML] OneVsRest should support setWeightCol
## What changes were proposed in this pull request?
add `setWeightCol` method for OneVsRest.
`weightCol` is ignored if classifier doesn't inherit
Repository: spark
Updated Branches:
refs/heads/master f44ead89f -> a5a318997
[SPARK-21306][ML] OneVsRest should support setWeightCol
## What changes were proposed in this pull request?
add `setWeightCol` method for OneVsRest.
`weightCol` is ignored if classifier doesn't inherit HasWeightCol
Repository: spark
Updated Branches:
refs/heads/master 2ff35a057 -> ddcd2e826
[SPARK-19270][ML] Add summary table to GLM summary
## What changes were proposed in this pull request?
Add R-like summary table to GLM summary, which includes feature name (if it
exists), parameter estimate, standard
Repository: spark
Updated Branches:
refs/heads/master 256358f66 -> 5d1850d4b
[MINOR][ML] Reorg RFormula params.
## What changes were proposed in this pull request?
There are mainly two reasons for this reorg:
* Some params are placed in ```RFormulaBase```, while others are placed in
Repository: spark
Updated Branches:
refs/heads/master 74ac1fb08 -> 69e5282d3
[SPARK-20307][ML][SPARKR][FOLLOW-UP] RFormula should handle invalid for both
features and label column.
## What changes were proposed in this pull request?
```RFormula``` should handle invalid for both features and
Repository: spark
Updated Branches:
refs/heads/master aaad34dc2 -> d2d2a5de1
[SPARK-18619][ML] Make QuantileDiscretizer/Bucketizer/StringIndexer/RFormula
inherit from HasHandleInvalid
## What changes were proposed in this pull request?
1, HasHandleInvalid supports override
2, Make
Repository: spark
Updated Branches:
refs/heads/master 7fcbb9b57 -> 56536e999
[SPARK-21285][ML] VectorAssembler reports the column name of unsupported data
type
## What changes were proposed in this pull request?
Add the column name to the exception raised for an unsupported data type.
Repository: spark
Updated Branches:
refs/heads/master a38643256 -> 4852b7d44
[SPARK-21310][ML][PYSPARK] Expose offset in PySpark
## What changes were proposed in this pull request?
Add offset to PySpark in GLM as in #16699.
## How was this patch tested?
Python test
Author: actuaryzhang
Repository: spark
Updated Branches:
refs/heads/master c605fee01 -> c19680be1
[SPARK-19852][PYSPARK][ML] Python StringIndexer supports 'keep' to handle
invalid data
## What changes were proposed in this pull request?
This PR is to maintain API parity with changes made in SPARK-17498 to
Repository: spark
Updated Branches:
refs/heads/master 37ef32e51 -> e0b047eaf
[SPARK-18518][ML] HasSolver supports override
## What changes were proposed in this pull request?
1, make param support non-final with `finalFields` option
2, generate `HasSolver` with `finalFields = false`
3,
Repository: spark
Updated Branches:
refs/heads/master b1d719e7c -> 37ef32e51
[SPARK-21275][ML] Update GLM test to use supportedFamilyNames
## What changes were proposed in this pull request?
Update GLM test to use supportedFamilyNames as suggested here:
Repository: spark
Updated Branches:
refs/heads/master 3c2fc19d4 -> 528c9281a
[ML] Fix scala-2.10 build failure of GeneralizedLinearRegressionSuite.
## What changes were proposed in this pull request?
Fix scala-2.10 build failure of ```GeneralizedLinearRegressionSuite```.
## How was this
Repository: spark
Updated Branches:
refs/heads/master 52981715b -> 49d767d83
[SPARK-18710][ML] Add offset in GLM
## What changes were proposed in this pull request?
Add support for offset in GLM. This is useful for at least two reasons:
1. Account for exposure: e.g., when modeling the number
Repository: spark
Updated Branches:
refs/heads/master 376d90d55 -> 0c8444cf6
[SPARK-14657][SPARKR][ML] RFormula w/o intercept should output reference
category when encoding string terms
## What changes were proposed in this pull request?
Please see
Repository: spark
Updated Branches:
refs/heads/master 35b644bd0 -> ff5676b01
[SPARK-20899][PYSPARK] PySpark supports stringIndexerOrderType in RFormula
## What changes were proposed in this pull request?
PySpark supports stringIndexerOrderType in RFormula as in #17967.
## How was this patch
Repository: spark
Updated Branches:
refs/heads/master 2dbe0c528 -> f47700c9c
[SPARK-14659][ML] RFormula consistent with R when handling strings
## What changes were proposed in this pull request?
When handling strings, the categories dropped by RFormula and R are different:
- RFormula drops the
Repository: spark
Updated Branches:
refs/heads/branch-2.2 9cbf39f1c -> e01f1f222
[SPARK-20768][PYSPARK][ML] Expose numPartitions (expert) param of PySpark
FPGrowth.
## What changes were proposed in this pull request?
Expose numPartitions (expert) param of PySpark FPGrowth.
## How was this
Repository: spark
Updated Branches:
refs/heads/master 913a6bfe4 -> 139da116f
[SPARK-20768][PYSPARK][ML] Expose numPartitions (expert) param of PySpark
FPGrowth.
## What changes were proposed in this pull request?
Expose numPartitions (expert) param of PySpark FPGrowth.
## How was this
Repository: spark
Updated Branches:
refs/heads/branch-2.2 8896c4ee9 -> 9cbf39f1c
[SPARK-19281][FOLLOWUP][ML] Minor fix for PySpark FPGrowth.
## What changes were proposed in this pull request?
Follow-up for #17218, some minor fix for PySpark ```FPGrowth```.
## How was this patch tested?
Repository: spark
Updated Branches:
refs/heads/master 3f94e64aa -> 913a6bfe4
[SPARK-19281][FOLLOWUP][ML] Minor fix for PySpark FPGrowth.
## What changes were proposed in this pull request?
Follow-up for #17218, some minor fix for PySpark ```FPGrowth```.
## How was this patch tested?
Existing
Repository: spark
Updated Branches:
refs/heads/branch-2.0 4dd34d004 -> 72e1f83d7
[SPARK-20862][MLLIB][PYTHON] Avoid passing float to ndarray.reshape in
LogisticRegressionModel
## What changes were proposed in this pull request?
Fixed TypeError with python3 and numpy 1.12.1. Numpy's
Repository: spark
Updated Branches:
refs/heads/branch-2.1 f4538c95f -> 13adc0fc0
[SPARK-20862][MLLIB][PYTHON] Avoid passing float to ndarray.reshape in
LogisticRegressionModel
## What changes were proposed in this pull request?
Fixed TypeError with python3 and numpy 1.12.1. Numpy's
Repository: spark
Updated Branches:
refs/heads/branch-2.2 1d107242f -> 83aeac9e0
[SPARK-20862][MLLIB][PYTHON] Avoid passing float to ndarray.reshape in
LogisticRegressionModel
## What changes were proposed in this pull request?
Fixed TypeError with python3 and numpy 1.12.1. Numpy's
Repository: spark
Updated Branches:
refs/heads/master 1816eb3be -> bc66a77bb
[SPARK-20862][MLLIB][PYTHON] Avoid passing float to ndarray.reshape in
LogisticRegressionModel
## What changes were proposed in this pull request?
Fixed TypeError with python3 and numpy 1.12.1. Numpy's `reshape` no
Repository: spark
Updated Branches:
refs/heads/branch-2.2 e936a96ba -> 1d107242f
[SPARK-20631][FOLLOW-UP] Fix incorrect tests.
## What changes were proposed in this pull request?
- Fix incorrect tests for `_check_thresholds`.
- Move test to `ParamTests`.
## How was this patch tested?
Unit
Repository: spark
Updated Branches:
refs/heads/master 9afcf127d -> 1816eb3be
[SPARK-20631][FOLLOW-UP] Fix incorrect tests.
## What changes were proposed in this pull request?
- Fix incorrect tests for `_check_thresholds`.
- Move test to `ParamTests`.
## How was this patch tested?
Unit
Repository: spark
Updated Branches:
refs/heads/branch-2.2 ee9d5975e -> e936a96ba
[SPARK-20764][ML][PYSPARK][FOLLOWUP] Fix visibility discrepancy with
numInstances and degreesOfFreedom in LR and GLR - Python version
## What changes were proposed in this pull request?
Add test cases for
Repository: spark
Updated Branches:
refs/heads/master d76633e3c -> 9afcf127d
[SPARK-20764][ML][PYSPARK][FOLLOWUP] Fix visibility discrepancy with
numInstances and degreesOfFreedom in LR and GLR - Python version
## What changes were proposed in this pull request?
Add test cases for PR-18062
Repository: spark
Updated Branches:
refs/heads/master 442287ae2 -> ad09e4ca0
[MINOR][SPARKR][ML] Joint coefficients with intercept for SparkR linear SVM
summary.
## What changes were proposed in this pull request?
Joint coefficients with intercept for SparkR linear SVM summary.
## How was
Repository: spark
Updated Branches:
refs/heads/branch-2.2 06c985c1b -> dbb068f4f
[MINOR][SPARKR][ML] Joint coefficients with intercept for SparkR linear SVM
summary.
## What changes were proposed in this pull request?
Joint coefficients with intercept for SparkR linear SVM summary.
## How
Repository: spark
Updated Branches:
refs/heads/branch-2.2 a57553279 -> a0bf5c47c
[SPARK-20764][ML][PYSPARK] Fix visibility discrepancy with numInstances and
degreesOfFreedom in LR and GLR - Python version
## What changes were proposed in this pull request?
SPARK-20097 exposed
Repository: spark
Updated Branches:
refs/heads/master f3ed62a38 -> cfca01136
[SPARK-20764][ML][PYSPARK] Fix visibility discrepancy with numInstances and
degreesOfFreedom in LR and GLR - Python version
## What changes were proposed in this pull request?
SPARK-20097 exposed degreesOfFreedom
Repository: spark
Updated Branches:
refs/heads/branch-2.2 b8fa79cec -> ba0117c27
[SPARK-20505][ML] Add docs and examples for ml.stat.Correlation and
ml.stat.ChiSquareTest.
## What changes were proposed in this pull request?
Add docs and examples for ```ml.stat.Correlation``` and
Repository: spark
Updated Branches:
refs/heads/master 324a904d8 -> 697a5e551
[SPARK-20505][ML] Add docs and examples for ml.stat.Correlation and
ml.stat.ChiSquareTest.
## What changes were proposed in this pull request?
Add docs and examples for ```ml.stat.Correlation``` and
Repository: spark
Updated Branches:
refs/heads/branch-2.2 10e599f69 -> a869e8bfd
[SPARK-20707][ML] ML deprecated APIs should be removed in major release.
## What changes were proposed in this pull request?
Before 2.2, MLlib kept removing APIs deprecated in the last feature/minor release.
But
Repository: spark
Updated Branches:
refs/heads/master b0888d1ac -> 9970aa096
[SPARK-20669][ML] LoR.family and LDA.optimizer should be case insensitive
## What changes were proposed in this pull request?
make param `family` in LoR and `optimizer` in LDA case insensitive
## How was this patch
Repository: spark
Updated Branches:
refs/heads/branch-2.2 3eb0ee06a -> 80a57fa90
[SPARK-20606][ML] Revert "[] ML 2.2 QA: Remove deprecated methods for ML"
This reverts commit b8733e0ad9f5a700f385e210450fd2c10137293e.
Author: Yanbo Liang
Closes #17944 from
Repository: spark
Updated Branches:
refs/heads/master 8ddbc431d -> 0698e6c88
[SPARK-20606][ML] Revert "[] ML 2.2 QA: Remove deprecated methods for ML"
This reverts commit b8733e0ad9f5a700f385e210450fd2c10137293e.
Author: Yanbo Liang
Closes #17944 from
Repository: spark
Updated Branches:
refs/heads/branch-2.0 46659974e -> d86dae8fe
[SPARK-20631][PYTHON][ML] LogisticRegression._checkThresholdConsistency should
use values not Params
## What changes were proposed in this pull request?
- Replace `getParam` calls with `getOrDefault` calls.
-
Repository: spark
Updated Branches:
refs/heads/master 0ef16bd4b -> 804949c6b
[SPARK-20631][PYTHON][ML] LogisticRegression._checkThresholdConsistency should
use values not Params
## What changes were proposed in this pull request?
- Replace `getParam` calls with `getOrDefault` calls.
- Fix
Repository: spark
Updated Branches:
refs/heads/branch-2.1 8e097890a -> 69786ea3a
[SPARK-20631][PYTHON][ML] LogisticRegression._checkThresholdConsistency should
use values not Params
## What changes were proposed in this pull request?
- Replace `getParam` calls with `getOrDefault` calls.
-
Repository: spark
Updated Branches:
refs/heads/branch-2.2 ef50a9548 -> 3ed2f4d51
[SPARK-20631][PYTHON][ML] LogisticRegression._checkThresholdConsistency should
use values not Params
## What changes were proposed in this pull request?
- Replace `getParam` calls with `getOrDefault` calls.
-
Repository: spark
Updated Branches:
refs/heads/branch-2.2 4bbfad44e -> 4b7aa0b1d
[SPARK-20606][ML] ML 2.2 QA: Remove deprecated methods for ML
## What changes were proposed in this pull request?
Remove ML methods we deprecated in 2.1.
## How was this patch tested?
Existing tests.
Author:
Repository: spark
Updated Branches:
refs/heads/master be53a7835 -> b8733e0ad
[SPARK-20606][ML] ML 2.2 QA: Remove deprecated methods for ML
## What changes were proposed in this pull request?
Remove ML methods we deprecated in 2.1.
## How was this patch tested?
Existing tests.
Author: Yanbo
Repository: spark
Updated Branches:
refs/heads/master bfc8c79c8 -> 0d16faab9
[SPARK-20574][ML] Allow Bucketizer to handle non-Double numeric column
## What changes were proposed in this pull request?
Bucketizer currently requires input column to be Double, but the logic should
work on any
Repository: spark
Updated Branches:
refs/heads/branch-2.2 425ed26d2 -> c8756288d
[SPARK-20574][ML] Allow Bucketizer to handle non-Double numeric column
## What changes were proposed in this pull request?
Bucketizer currently requires input column to be Double, but the logic should
work on
Repository: spark
Updated Branches:
refs/heads/branch-2.2 b6727795f -> 425ed26d2
[SPARK-20047][FOLLOWUP][ML] Constrained Logistic Regression follow up
## What changes were proposed in this pull request?
Address some minor comments for #17715:
* Put bound-constrained optimization params under
Repository: spark
Updated Branches:
refs/heads/master 57b64703e -> c5dceb8c6
[SPARK-20047][FOLLOWUP][ML] Constrained Logistic Regression follow up
## What changes were proposed in this pull request?
Address some minor comments for #17715:
* Put bound-constrained optimization params under
Repository: spark
Updated Branches:
refs/heads/branch-2.2 612952251 -> 34dec68d7
[MINOR][ML] Fix some PySpark & SparkR flaky tests
## What changes were proposed in this pull request?
Some PySpark & SparkR tests run with a tiny dataset and a tiny ```maxIter```, which
means they have not converged.
Repository: spark
Updated Branches:
refs/heads/master 7fecf5130 -> dbb06c689
[MINOR][ML] Fix some PySpark & SparkR flaky tests
## What changes were proposed in this pull request?
Some PySpark & SparkR tests run with a tiny dataset and a tiny ```maxIter```, which
means they have not converged. I
Repository: spark
Updated Branches:
refs/heads/branch-2.2 b62ebd91b -> e2591c6d7
[SPARK-18901][FOLLOWUP][ML] Require in LR LogisticAggregator is redundant
## What changes were proposed in this pull request?
This is a follow-up PR of #17478.
## How was this patch tested?
Existing tests
Repository: spark
Updated Branches:
refs/heads/master 0bc7a9021 -> 387565cf1
[SPARK-18901][FOLLOWUP][ML] Require in LR LogisticAggregator is redundant
## What changes were proposed in this pull request?
This is a follow-up PR of #17478.
## How was this patch tested?
Existing tests
Author:
Repository: spark
Updated Branches:
refs/heads/branch-2.2 2bef01f64 -> cf16c3250
[SPARK-18901][ML] Require in LR LogisticAggregator is redundant
## What changes were proposed in this pull request?
In MultivariateOnlineSummarizer,
`add` and `merge` have checks for weights and feature sizes.
Repository: spark
Updated Branches:
refs/heads/master 776a2c0e9 -> 90264aced
[SPARK-18901][ML] Require in LR LogisticAggregator is redundant
## What changes were proposed in this pull request?
In MultivariateOnlineSummarizer,
`add` and `merge` have checks for weights and feature sizes. The
Repository: spark
Updated Branches:
refs/heads/master 3fada2f50 -> 1d00761b9
[MINOR][SPARKR] Move 'Data type mapping between R and Spark' to right place in
SparkR doc.
Section ```Data type mapping between R and Spark``` is currently in the wrong place
in the SparkR doc; we should move it
Repository: spark
Updated Branches:
refs/heads/branch-2.1 c4d2b8338 -> 277ed375b
[SPARK-19925][SPARKR] Fix SparkR spark.getSparkFiles fails when it was called
on executors.
## What changes were proposed in this pull request?
SparkR ```spark.getSparkFiles``` fails when it is called on
Repository: spark
Updated Branches:
refs/heads/master c1e87e384 -> 478fbc866
[SPARK-19925][SPARKR] Fix SparkR spark.getSparkFiles fails when it was called
on executors.
## What changes were proposed in this pull request?
SparkR ```spark.getSparkFiles``` fails when it is called on executors,
Repository: spark
Updated Branches:
refs/heads/master 1fa58868b -> 81303f7ca
[SPARK-19806][ML][PYSPARK] PySpark GeneralizedLinearRegression supports tweedie
distribution.
## What changes were proposed in this pull request?
PySpark ```GeneralizedLinearRegression``` supports tweedie
Repository: spark
Updated Branches:
refs/heads/master 8417a7ae6 -> 93ae176e8
[SPARK-19745][ML] SVCAggregator captures coefficients in its closure
## What changes were proposed in this pull request?
JIRA: [SPARK-19745](https://issues.apache.org/jira/browse/SPARK-19745)
Reorganize
Repository: spark
Updated Branches:
refs/heads/master 3bd8ddf7c -> d2a879762
[SPARK-19734][PYTHON][ML] Correct OneHotEncoder doc string to say dropLast
## What changes were proposed in this pull request?
Updates the doc string to match up with the code
i.e. say dropLast instead of
Repository: spark
Updated Branches:
refs/heads/master de2b53df4 -> 3bd8ddf7c
[MINOR][ML] Fix comments in LSH Examples and Python API
## What changes were proposed in this pull request?
Remove `org.apache.spark.examples.` in
Add a slash in one of the Python docs.
## How was this patch tested?
Repository: spark
Updated Branches:
refs/heads/master 410392ed7 -> 6ab60542e
[MINOR][ML][DOC] Document default value for
GeneralizedLinearRegression.linkPower
Add Scaladoc for GeneralizedLinearRegression.linkPower default value
Follow-up to https://github.com/apache/spark/pull/16344
Repository: spark
Updated Branches:
refs/heads/master 1a3f5f8c5 -> b40659838
[SPARK-18285][SPARKR] SparkR approxQuantile supports input multiple columns
## What changes were proposed in this pull request?
SparkR ```approxQuantile``` supports input multiple columns.
## How was this patch
Repository: spark
Updated Branches:
refs/heads/master 21b4ba2d6 -> 08c1972a0
[SPARK-18080][ML][PYTHON] Python API & Examples for Locality Sensitive Hashing
## What changes were proposed in this pull request?
This pull request includes Python API and examples for LSH. The API changes were
Repository: spark
Updated Branches:
refs/heads/master 90817a6cd -> 4172ff80d
[SPARK-18929][ML] Add Tweedie distribution in GLM
## What changes were proposed in this pull request?
I propose to add the full Tweedie family into the GeneralizedLinearRegression
model. The Tweedie family is
Repository: spark
Updated Branches:
refs/heads/master 76db394f2 -> 0e821ec6f
[SPARK-19313][ML][MLLIB] GaussianMixture should limit the number of features
## What changes were proposed in this pull request?
The following test will fail on current master
scala
test("gmm fails on high
Repository: spark
Updated Branches:
refs/heads/branch-2.1 8daf10e3f -> 1e07a7192
[SPARK-19155][ML] Make family case insensitive in GLM
## What changes were proposed in this pull request?
This is a supplement to PR #16516 which did not make the value from `getFamily`
case insensitive. Current