Re: [scikit-learn] Questions about partial_fit and the Incremental library in scikit-learn

2019-09-09 Thread Farzana Anowar

On 2019-09-09 12:12, Daniel Sullivan wrote:

Hi Farzana,

If I understand your question correctly, you're asking how the SGD
classifier works incrementally? The SGD algorithm maintains a single
set of weights and iterates through the data points in a batch one at
a time, adjusting its weights on each iteration. So to answer your
question, it trains on each instance, not on the batch as a whole.
However, the algorithm can iterate multiple times through a single
batch. Let me know if that answers your question.
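
A minimal sketch of this pattern, using toy random data as a stand-in
for a real stream (the batch sizes and class labels here are
illustrative):

import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)
# Three (X, y) chunks of 100 samples x 5 features standing in for a stream
batches = [(rng.rand(100, 5), rng.randint(0, 2, 100)) for _ in range(3)]

clf = SGDClassifier()
for X_batch, y_batch in batches:
    # One pass over the batch: the single weight vector is updated
    # once per instance, not once per batch.
    clf.partial_fit(X_batch, y_batch, classes=np.array([0, 1]))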

Best,

Danny

On Mon, Sep 9, 2019 at 11:56 AM Farzana Anowar wrote:


Hello Sir/Madam,

I subscribed to the link you sent me.

I am posting my question again:

This is Farzana Anowar, a Ph.D. candidate at the University of Regina.
Currently, I'm working on developing a model that learns incrementally
from non-stationary data. I have come across the Incremental library
for scikit-learn that allows doing exactly that using partial_fit. I
have searched a lot for detailed information about this 'Incremental'
library and 'partial_fit'; however, I couldn't find any.

It would be great if you could provide me with some detailed
information about how these two actually work. For example, if we
take SGD as the classifier, the Incremental library will allow me to
take chunks/batches of data. My question is: does this Incremental
library train (using partial_fit) on the whole batch at a time and
then produce a classification performance, or does it take a batch
and train on each instance from the batch one at a time?

Thanks in advance!

--
Regards,

Farzana Anowar

Hello Daniel,

Thank you so much! I think your clarification makes sense. So,
whatever batches I pass to the classifier, it will train on each
instance within a batch.


I was just wondering if you could give me some more information about
partial_fit. Just for your reference, I was having a look at this code:


https://dask-ml.readthedocs.io/en/latest/incremental.html
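
For reference, a minimal sketch of the pattern that page describes,
assuming dask_ml.wrappers.Incremental wrapping an estimator that
implements partial_fit (the toy data below is illustrative):

from dask_ml.datasets import make_classification
from dask_ml.wrappers import Incremental
from sklearn.linear_model import SGDClassifier

# A dask array split into blocks; each block becomes one partial_fit call
X, y = make_classification(n_samples=10000, chunks=1000, random_state=0)

# Incremental feeds the wrapped estimator one block at a time,
# so only one block needs to be in memory at once
inc = Incremental(SGDClassifier())
inc.fit(X, y, classes=[0, 1])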

Thanks!

--
Regards,

Farzana Anowar



[scikit-learn] Incremental learning in scikit-learn

2019-09-09 Thread Farzana Anowar

Hello Sir/Madam,

I am going through the incremental learning algorithms in scikit-learn.
SGD in scikit-learn is one such algorithm: it allows learning
incrementally by passing chunks/batches. Now my question is: does
scikit-learn keep all the training batches in memory? Or does it keep
chunks/batches in memory up to a certain size? Or does it keep only
one chunk/batch in memory while training and discard the other
chunks/batches after training on them? Does that mean it suffers from
catastrophic forgetting?


Thanks!

--
Regards,

Farzana Anowar


Re: [scikit-learn] Incremental learning in scikit-learn

2019-09-09 Thread Farzana Anowar

On 2019-09-09 17:53, Daniel Sullivan wrote:

Hey Farzana,

The algorithm only keeps one batch in memory at a time. Between
batches, SGD keeps the set of weights that it alters with each
iteration over a data point or instance within a batch. This set of
weights functions as the persisted state between calls of
partial_fit. That means you will get the same results with SGD
regardless of your batch size, so you can choose your batch size
according to your memory constraints. Hope that helps.
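
(As an illustration, a sketch that keeps only one chunk in memory at
a time; the file name 'train.csv', its 'label' column, and the binary
classes are all hypothetical:)

import pandas as pd
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier()
# read_csv with chunksize yields one DataFrame chunk at a time, so
# memory use is bounded by the chunk size, not by the dataset size
for chunk in pd.read_csv('train.csv', chunksize=10_000):
    X = chunk.drop(columns='label').values
    y = chunk['label'].values
    # The weight vector persists between calls; nothing else is kept
    clf.partial_fit(X, y, classes=[0, 1])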

- Danny

On Mon, Sep 9, 2019 at 5:53 PM Farzana Anowar wrote:


Hello Sir/Madam,

I am going through the incremental learning algorithms in
scikit-learn. SGD in scikit-learn is one such algorithm: it allows
learning incrementally by passing chunks/batches. Now my question is:
does scikit-learn keep all the training batches in memory? Or does it
keep chunks/batches in memory up to a certain size? Or does it keep
only one chunk/batch in memory while training and discard the other
chunks/batches after training on them? Does that mean it suffers from
catastrophic forgetting?

Thanks!

--
Regards,

Farzana Anowar

Thanks a lot!
--
Regards,

Farzana Anowar


Re: [scikit-learn] Attribute Incremental learning

2020-01-16 Thread Farzana Anowar

On 2020-01-16 08:36, Max Halford wrote:

Hello Farzana,

You might want to check out scikit-multiflow [1] and creme [2] (I'm
the author).

Kind regards.

On Tue, 14 Jan 2020 at 16:59, Farzana Anowar wrote:


Hello,

This is Farzana. I am trying to understand attribute-incremental
learning (or virtual concept drift): every time a new feature becomes
available for a real-time dataset (e.g., an online auction dataset),
a classifier should add that new feature to the existing features and
classify the updated dataset (previous features plus the new feature)
incrementally. I know that we can convert a static classifier to an
incremental classifier in scikit-learn. However, I could not find any
library or function for attribute-incremental learning, or any
detailed information about it. It would be great if anyone could give
me some insight on this.

Thanks!
--
Best Regards,

Farzana Anowar,
PhD Candidate
Department of Computer Science
University of Regina


--

Max Halford

+336 28 25 13 38

Links:
--
[1] https://scikit-multiflow.github.io/
[2] https://creme-ml.github.io/
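
(For illustration, a minimal sketch of how creme's dict-based fit_one
API can accommodate a feature that first appears mid-stream; the
feature names here are made up:)

from creme import linear_model

model = linear_model.LogisticRegression()

# Early instances carry two features
model.fit_one({'price': 10.0, 'bids': 3}, True)

# A later instance introduces a new feature; the model simply starts
# learning a weight for 'seller_rating' when it first appears
model.fit_one({'price': 12.5, 'bids': 5, 'seller_rating': 4.8}, False)

print(model.predict_one({'price': 11.0, 'bids': 4, 'seller_rating': 4.5}))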

Hello Max,

Thanks a lot.
--
Best Regards,

Farzana Anowar,
PhD Candidate
Department of Computer Science
University of Regina



Re: [scikit-learn] transfer learning doubt

2020-03-19 Thread Farzana Anowar

On 2020-03-19 00:11, Praneet Singh wrote:

I am training an SGD classifier on a dataset which is temporary and
will be lost after some time. So I am planning to save the model in a
pickle file, reuse it, and train it again when another dataset
arrives. But it forgets the previously learned data.

As far as I have researched on Google, a TensorFlow model allows
transfer learning without forgetting the previous learning, but is
there any other way to achieve this with an sklearn model?
Any help would be appreciated.
Did you use an incremental estimator and partial_fit? If not, try
them. That should work.

Another option is to use deep learning: store the weights of the
first model, initialize the second model with those weights, and keep
doing this for the rest of the models.
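
(A minimal sketch of the first option, with toy random data standing
in for the two datasets:)

import pickle
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)
X_old, y_old = rng.rand(200, 4), rng.randint(0, 2, 200)  # first dataset
X_new, y_new = rng.rand(200, 4), rng.randint(0, 2, 200)  # later dataset

# Train on the first dataset and persist the fitted model
clf = SGDClassifier()
clf.partial_fit(X_old, y_old, classes=[0, 1])
with open('model.pkl', 'wb') as f:
    pickle.dump(clf, f)

# When the new dataset arrives, reload and continue training:
# partial_fit updates the stored weights instead of refitting from scratch
with open('model.pkl', 'rb') as f:
    clf = pickle.load(f)
clf.partial_fit(X_new, y_new)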

--
Best Regards,

Farzana Anowar,
PhD Candidate
Department of Computer Science
University of Regina


[scikit-learn] Incremental learning in scikit-learn

2020-11-15 Thread Farzana Anowar

Hello everyone,

Currently, I am working with incremental learning. I know that
scikit-learn allows incremental learning for some classifiers, e.g.
SGD. In incremental learning, the data is not available all at once;
rather, it becomes available chunk by chunk over time.

Now, my question is: does scikit-learn allow the data chunks to
differ in size, or do all the chunks have to be the same size?
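
(For what it's worth, partial_fit itself places no constraint on the
number of rows per call, so a sketch like the following, with unequal
chunk sizes, runs fine; only the number of features must stay fixed:)

import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)
clf = SGDClassifier()

# Chunks of different sizes: 100, 37, and 250 samples
for n in (100, 37, 250):
    X, y = rng.rand(n, 5), rng.randint(0, 2, n)
    clf.partial_fit(X, y, classes=[0, 1])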


Thanks!

--
Best Regards,

Farzana Anowar,
PhD Candidate
Department of Computer Science
University of Regina


[scikit-learn] Issue in BIRCH clustering algo

2021-02-11 Thread Farzana Anowar

Hello everyone,

I was trying to run the BIRCH clustering algorithm. However, after
fitting the model I am facing the following error:

AttributeError: '_CFSubcluster' object has no attribute 'sq_norm_'

This error occurs only after fitting the model, and I couldn't find
any proper explanation for it. Could anyone give me any suggestions?
It would be really helpful.


Here is my code:

from sklearn.cluster import Birch
import numpy as np

# Placeholder data: df stands in for the actual (numeric) dataset
df = np.random.RandomState(0).rand(100, 5)

# Creating the BIRCH clustering model; n_clusters=None skips the
# final global clustering step and returns the subclusters directly
model = Birch(n_clusters=None)

# Fit the data (training)
model.fit(df)

# Predict on the same data
pred = model.predict(df)

--
Best Regards,

Farzana Anowar,
PhD Candidate
Department of Computer Science
University of Regina


[scikit-learn] Batch Incremental Learning from Scikit-Multiflow

2022-02-25 Thread Farzana Anowar

Hello Scikit-learn community,

I hope you all are doing well!

I am currently working with BatchIncrementalClassifier from the
scikit-multiflow package. For this BatchIncrementalClassifier, the
following example is given:


from skmultiflow.data import SEAGenerator
from skmultiflow.meta import BatchIncrementalClassifier

# Setup a data stream
stream = SEAGenerator(random_state=1)

# Pre-training the classifier with 200 samples
X, y = stream.next_sample(200)
batch_incremental_cfier = BatchIncrementalClassifier()
batch_incremental_cfier.partial_fit(X, y)

# Preparing the processing of 5000 samples and correct prediction count
n_samples = 0
correct_cnt = 0
while n_samples < 5000 and stream.has_more_samples():
    X, y = stream.next_sample()
    y_pred = batch_incremental_cfier.predict(X)
    if y[0] == y_pred[0]:
        correct_cnt += 1
    batch_incremental_cfier.partial_fit(X, y)
    n_samples += 1

# Display results
print('Batch Incremental ensemble classifier example')
print('{} samples analyzed'.format(n_samples))
print('Performance: {}'.format(correct_cnt / n_samples))


Now my questions are:

1. For pre-training the model, the classifier used 200 samples from
the stream, and then it does prequential (test-then-train) evaluation
on 5000 samples. Are the 200 samples considered the 1st batch of data
from the stream, used only for pre-training, so that when the 2nd
batch of data (5000 samples) becomes available the evaluation is
based on the pre-trained model? (This makes sense to me, as this way
we keep the influence of the previously pre-trained model.)

or

2. Is this one batch (200+5000) from the stream, where the first 200
samples are used to pre-train and the rest are used for evaluation?
And when the next batch arrives from the stream, will it do the same
thing (200 for pre-training and the rest for evaluation)? (If this is
the case, aren't we training from scratch each time, which would mean
BatchIncrementalClassifier is no longer an incremental classifier?)



Thanks!

--
Best Regards,

Farzana Anowar,
PhD Candidate
Department of Computer Science
University of Regina