Hello Scikit-learn community,

I hope you all are doing well!

I am currently working with BatchIncrementalClassifier from the scikit-multiflow package. The documentation gives the following example for this classifier:

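# Imports
from skmultiflow.data import SEAGenerator
from skmultiflow.meta import BatchIncrementalClassifier

# Set up a data stream
stream = SEAGenerator(random_state=1)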

# Pre-training the classifier with 200 samples
X, y = stream.next_sample(200)
batch_incremental_cfier = BatchIncrementalClassifier()
batch_incremental_cfier.partial_fit(X, y)

# Process up to 5000 samples one at a time, counting correct predictions
n_samples = 0
correct_cnt = 0
while n_samples < 5000 and stream.has_more_samples():
    X, y = stream.next_sample()
    y_pred = batch_incremental_cfier.predict(X)
    if y[0] == y_pred[0]:
        correct_cnt += 1
    batch_incremental_cfier.partial_fit(X, y)
    n_samples += 1

# Display results
print('Batch Incremental ensemble classifier example')
print('{} samples analyzed'.format(n_samples))
print('Performance: {}'.format(correct_cnt / n_samples))


Now my questions are:

1. The classifier is pre-trained on 200 samples from the stream and then evaluated prequentially (test-then-train) on 5000 samples. Are the 200 samples considered the first batch from the stream, used only for pre-training, with the evaluation on the second batch (the 5000 samples) starting from the pre-trained model? (This makes sense to me, since the pre-trained model would then influence the later predictions.)

or

2. Is this a single batch (200 + 5000) from the stream, where the first 200 samples are used for pre-training and the rest for evaluation? When the next batch arrives from the stream, will it do the same thing (200 for pre-training and the rest for evaluation)? (If so, aren't we training from scratch each time, which would mean BatchIncrementalClassifier is no longer an incremental classifier?)


Thanks!

--
Best Regards,

Farzana Anowar,
PhD Candidate
Department of Computer Science
University of Regina
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
