Repository: incubator-systemml
Updated Branches:
  refs/heads/gh-pages bb97a4bc6 -> 5c4e27c70

[SYSTEMML-1238] Updated the default parameters of mllearn to match that of scikit learn.

- Also updated the test to compare our algorithm to scikit-learn.

Closes #398.

Project: http://git-wip-us.apache.org/repos/asf/incubator-systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-systemml/commit/0fb74b94
Tree: http://git-wip-us.apache.org/repos/asf/incubator-systemml/tree/0fb74b94
Diff: http://git-wip-us.apache.org/repos/asf/incubator-systemml/diff/0fb74b94

Branch: refs/heads/gh-pages
Commit: 0fb74b94af9e244b5695745ac7b3651b485b812f
Parents: bb97a4b
Author: Niketan Pansare <[email protected]>
Authored: Fri Feb 17 14:54:23 2017 -0800
Committer: Niketan Pansare <[email protected]>
Committed: Fri Feb 17 14:59:49 2017 -0800

----------------------------------------------------------------------
 algorithms-regression.md  | 8 ++++----
 beginners-guide-python.md | 2 +-
 python-reference.md       | 6 +++---
 3 files changed, 8 insertions(+), 8 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-systemml/blob/0fb74b94/algorithms-regression.md
----------------------------------------------------------------------
diff --git a/algorithms-regression.md b/algorithms-regression.md
index 992862e..80b38a3 100644
--- a/algorithms-regression.md
+++ b/algorithms-regression.md
@@ -83,8 +83,8 @@ efficient when the number of features $m$ is relatively small
 <div data-lang="Python" markdown="1">
 {% highlight python %}
 from systemml.mllearn import LinearRegression
-# C = 1/reg
-lr = LinearRegression(sqlCtx, fit_intercept=True, C=1.0, solver='direct-solve')
+# C = 1/reg (to disable regularization, use float("inf"))
+lr = LinearRegression(sqlCtx, fit_intercept=True, normalize=False, C=float("inf"), solver='direct-solve')
 # X_train, y_train and X_test can be NumPy matrices or Pandas DataFrame or SciPy Sparse Matrix
 y_test = lr.fit(X_train, y_train)
 # df_train is DataFrame that contains two columns: "features" (of type Vector) and "label". df_test is a DataFrame that contains the column "features"
@@ -125,8 +125,8 @@ y_test = lr.fit(df_train)
 <div data-lang="Python" markdown="1">
 {% highlight python %}
 from systemml.mllearn import LinearRegression
-# C = 1/reg
-lr = LinearRegression(sqlCtx, fit_intercept=True, max_iter=100, tol=0.000001, C=1.0, solver='newton-cg')
+# C = 1/reg (to disable regularization, use float("inf"))
+lr = LinearRegression(sqlCtx, fit_intercept=True, normalize=False, max_iter=100, tol=0.000001, C=float("inf"), solver='newton-cg')
 # X_train, y_train and X_test can be NumPy matrices or Pandas DataFrames or SciPy Sparse matrices
 y_test = lr.fit(X_train, y_train)
 # df_train is DataFrame that contains two columns: "features" (of type Vector) and "label". df_test is a DataFrame that contains the column "features"

http://git-wip-us.apache.org/repos/asf/incubator-systemml/blob/0fb74b94/beginners-guide-python.md
----------------------------------------------------------------------
diff --git a/beginners-guide-python.md b/beginners-guide-python.md
index 4d1b098..ffab09e 100644
--- a/beginners-guide-python.md
+++ b/beginners-guide-python.md
@@ -228,7 +228,7 @@ X_test = diabetes_X[-20:]
 y_train = diabetes.target[:-20]
 y_test = diabetes.target[-20:]
 # Create linear regression object
-regr = LinearRegression(sqlCtx, fit_intercept=True, C=1, solver='direct-solve')
+regr = LinearRegression(sqlCtx, fit_intercept=True, C=float("inf"), solver='direct-solve')
 # Train the model using the training sets
 regr.fit(X_train, y_train)
 y_predicted = regr.predict(X_test)

http://git-wip-us.apache.org/repos/asf/incubator-systemml/blob/0fb74b94/python-reference.md
----------------------------------------------------------------------
diff --git a/python-reference.md b/python-reference.md
index 65dcb5c..8d38598 100644
--- a/python-reference.md
+++ b/python-reference.md
@@ -731,7 +731,7 @@ LogisticRegression score: 0.922222
 ### Reference documentation
- *class*`systemml.mllearn.estimators.LinearRegression`(*sqlCtx*, *fit\_intercept=True*, *max\_iter=100*, *tol=1e-06*, *C=1.0*, *solver='newton-cg'*, *transferUsingDF=False*)(#systemml.mllearn.estimators.LinearRegression "Permalink to this definition")
+ *class*`systemml.mllearn.estimators.LinearRegression`(*sqlCtx*, *fit\_intercept=True*, *normalize=False*, *max\_iter=100*, *tol=1e-06*, *C=float("inf")*, *solver='newton-cg'*, *transferUsingDF=False*)(#systemml.mllearn.estimators.LinearRegression "Permalink to this definition")
 : Bases: `systemml.mllearn.estimators.BaseSystemMLRegressor`{.xref .py .py-class .docutils .literal}
@@ -760,7 +760,7 @@ LogisticRegression score: 0.922222
 >>> # The mean square error
 >>> print("Residual sum of squares: %.2f" % np.mean((regr.predict(diabetes_X_test) - diabetes_y_test) ** 2))
- *class*`systemml.mllearn.estimators.LogisticRegression`(*sqlCtx*, *penalty='l2'*, *fit\_intercept=True*, *max\_iter=100*, *max\_inner\_iter=0*, *tol=1e-06*, *C=1.0*, *solver='newton-cg'*, *transferUsingDF=False*)(#systemml.mllearn.estimators.LogisticRegression "Permalink to this definition")
+ *class*`systemml.mllearn.estimators.LogisticRegression`(*sqlCtx*, *penalty='l2'*, *fit\_intercept=True*, *normalize=False*, *max\_iter=100*, *max\_inner\_iter=0*, *tol=1e-06*, *C=1.0*, *solver='newton-cg'*, *transferUsingDF=False*)(#systemml.mllearn.estimators.LogisticRegression "Permalink to this definition")
 : Bases: `systemml.mllearn.estimators.BaseSystemMLClassifier`{.xref .py .py-class .docutils .literal}
@@ -817,7 +817,7 @@ LogisticRegression score: 0.922222
 >>> prediction = model.transform(test)
 >>> prediction.show()
- *class*`systemml.mllearn.estimators.SVM`(*sqlCtx*, *fit\_intercept=True*, *max\_iter=100*, *tol=1e-06*, *C=1.0*, *is\_multi\_class=False*, *transferUsingDF=False*)(#systemml.mllearn.estimators.SVM "Permalink to this definition")
+ *class*`systemml.mllearn.estimators.SVM`(*sqlCtx*, *fit\_intercept=True*, *normalize=False*, *max\_iter=100*, *tol=1e-06*, *C=1.0*, *is\_multi\_class=False*, *transferUsingDF=False*)(#systemml.mllearn.estimators.SVM "Permalink to this definition")
 : Bases: `systemml.mllearn.estimators.BaseSystemMLClassifier`{.xref .py .py-class .docutils .literal}
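
Note on the new default: the changed comment in the diff states `C = 1/reg` and that `float("inf")` disables regularization. A minimal sketch of that arithmetic in plain Python follows. The helper names `reg_to_c` and `c_to_reg` are hypothetical, introduced only for illustration; they are not part of `systemml.mllearn`, and this does not show how the library handles `C` internally.

```python
import math


def reg_to_c(reg: float) -> float:
    """Hypothetical helper: map a regularization strength to C = 1/reg."""
    if reg < 0:
        raise ValueError("reg must be non-negative")
    # reg == 0 means "no regularization", which corresponds to C = infinity.
    return math.inf if reg == 0 else 1.0 / reg


def c_to_reg(c: float) -> float:
    """Hypothetical helper: map C back to a regularization strength 1/C."""
    if c <= 0:
        raise ValueError("C must be positive")
    # IEEE floating-point division gives 1.0 / inf == 0.0.
    return 1.0 / c


if __name__ == "__main__":
    print(c_to_reg(1.0))           # old default C=1.0
    print(c_to_reg(float("inf")))  # new default C=float("inf")
```

Under this reading, the old default `C=1.0` corresponds to a regularization strength of 1, while the new default `C=float("inf")` corresponds to a strength of 0, that is, an unregularized fit.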
