Hi Mamun,

The new samples generated by SMOTE are synthetic: each one is created by interpolating between existing minority-class samples. You can refer to the SMOTE paper by Chawla et al. for more information. Therefore, there are no indices linking them to the original data. However, when under-sampling you can get this information by setting `return_indices=True`.
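To illustrate the difference, here is a minimal sketch in plain Python. It is not imbalanced-learn's implementation: the helpers `smote_like_sample` and `random_under_sample` are hypothetical names made up for this example, and the interpolation uses only the single nearest neighbour for brevity. It shows why a synthetic point has no original index, while an under-sampled row does.

```python
import random


def smote_like_sample(minority, rng):
    """Create one synthetic point between a minority sample and its
    nearest minority neighbour.

    Hypothetical helper: a simplified, 1-nearest-neighbour version of
    the SMOTE interpolation idea, not imbalanced-learn's code.
    """
    i = rng.randrange(len(minority))
    base = minority[i]
    # Nearest *other* minority sample by squared Euclidean distance.
    j = min(
        (k for k in range(len(minority)) if k != i),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(base, minority[k])),
    )
    gap = rng.random()  # interpolation factor in [0, 1)
    # The new point lies on the segment between two original rows,
    # so it does not correspond to any single row index of the input.
    return [a + gap * (b - a) for a, b in zip(base, minority[j])]


def random_under_sample(X, n_keep, rng):
    """Keep n_keep rows of X; return the kept rows and their original indices."""
    kept = sorted(rng.sample(range(len(X)), n_keep))
    return [X[k] for k in kept], kept


rng = random.Random(0)
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
majority = [[5.0, 5.0], [6.0, 5.0], [5.0, 7.0], [7.0, 7.0], [6.0, 6.0]]

synthetic = smote_like_sample(minority, rng)
majority_kept, kept_indices = random_under_sample(majority, 3, rng)
```

Every row in `majority_kept` can be traced back through `kept_indices`, whereas `synthetic` is a new point computed from two original rows.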
Cheers,

On 12 December 2016 at 13:56, Mamun Rashid <[email protected]> wrote:

> Hi All,
> Not sure if questions regarding the contributory packages are answered
> here. Just trying my luck.
>
> I have a seriously imbalanced classification problem. I am trying to
> use the SMOTE+ENN oversampling and undersampling method to oversample my
> minority class and undersample my majority class.
>
> ========
>
> import pandas as pd
> from sklearn.datasets import make_classification
> from imblearn.combine import SMOTEENN
>
> sm = SMOTEENN()
> X, y = make_classification(n_classes=2, class_sep=2, weights=[0.2, 0.8],
>                            n_informative=1, n_redundant=1, flip_y=0,
>                            n_features=3, n_clusters_per_class=1,
>                            n_samples=50, random_state=10)
> X_df = pd.DataFrame(X)
> X_resampled, y_resampled = sm.fit_sample(X_df, y)
>
> =========
>
> I understand that SMOTE returns a resampled data matrix, i.e. X_resampled. I
> was wondering if there is a direct way to retrieve the indexes of the
> original data observations?
>
> Thanks in advance.
>
> Best Regards and Seasons Greetings,
> Mamun
>
> _______________________________________________
> scikit-learn mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/scikit-learn
>

--
Guillaume Lemaitre
INRIA Saclay - Ile-de-France
Equipe PARIETAL
[email protected]
---
https://glemaitre.github.io/
