On 2020-05-08 20:02, joseph pareti wrote:
In general I prefer doing:
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.33, random_state=42)
    clf = RandomForestClassifier(n_estimators=100, max_depth=None)
    clf_f = clf.fit(X_train, y_train)
    predicted_labels = clf_f.predict(X_test)
    score = clf.score(X_test, y_test)
    score1 = metrics.accuracy_score(y_test, predicted_labels)
rather than:
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.33, random_state=42)
    clf0 = RandomForestClassifier(n_estimators=100, max_depth=None)
    clf0.fit(X_train, y_train)
    y_pred = clf0.predict(X_test)
    score = metrics.accuracy_score(y_test, y_pred)
Are the two code snippets really equivalent?
You didn't give any context or say which package you're using!
After searching for "RandomForestClassifier", I'm guessing that you're
using scikit-learn.
From the documentation here:
https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier.fit
it says:
Returns: self : object
so it looks like clf.fit(...) returns clf itself.
That being the case, then, yes, the two snippets are equivalent: clf_f
and clf are just two names for the same fitted estimator.
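A quick sketch to check this for yourself (assuming scikit-learn is
installed; the toy data from make_classification is just for
illustration). It confirms that fit() returns the estimator itself, and
that score() on either name agrees with metrics.accuracy_score on the
predictions:

    from sklearn import metrics
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Toy data standing in for the original X, y.
    X, y = make_classification(n_samples=200, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.33, random_state=42)

    clf = RandomForestClassifier(n_estimators=100, max_depth=None)
    clf_f = clf.fit(X_train, y_train)

    # fit() returns self, so both names point at the same object.
    print(clf_f is clf)  # True

    # score() computes mean accuracy on the given test data, which is
    # exactly what accuracy_score gives on the same predictions.
    predicted = clf_f.predict(X_test)
    print(clf.score(X_test, y_test)
          == metrics.accuracy_score(y_test, predicted))  # True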
--
https://mail.python.org/mailman/listinfo/python-list