https://scikit-learn.org/stable/modules/svm.html
Of the SVM classes mentioned above, which sparse matrices are
appropriate to use with them?
https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html#scipy.sparse.csr_matrix
It is not very clear what matrix operations
There is no such code. You need to make sure that the normalisation you use
matches the normalisation applied when constructing a stop word list.
Unfortunately we do not provide for this directly, and it is not easy to do
so in the general case.
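To illustrate the point about matching normalisation, here is a minimal, library-independent sketch. The helpers `normalise` and `tokenize` are hypothetical stand-ins for whatever preprocessing and tokenisation you actually use; they are not scikit-learn functions.

```python
import re

def normalise(token):
    # Hypothetical normalisation: lowercasing only. Replace with whatever
    # your real preprocessing does (e.g. accent stripping as well).
    return token.lower()

def tokenize(text):
    # Assumed word-level tokenisation: runs of two or more word characters.
    return re.findall(r"\b\w\w+\b", text)

# Stop words as supplied by the user, in mixed case.
raw_stop_words = ["The", "AND"]
# Apply the same normalisation to the stop list as to the tokens.
stop_words = {normalise(w) for w in raw_stop_words}

text = "The cat and THE dog"

# Tokens normalised the same way as the stop list: stop words are removed.
kept_matched = [t for t in map(normalise, tokenize(text)) if t not in stop_words]

# Tokens left in their original case: "The" and "THE" slip through.
kept_mismatched = [t for t in tokenize(text) if t not in stop_words]

print(kept_matched)
print(kept_mismatched)
```

The second list shows the failure mode asked about: when the tokens are not lowercased but the stop list is, capitalised stop words are not filtered out.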
___
Hi,
https://github.com/scikit-learn/scikit-learn/blob/002f891a33b612be389d9c488699db5689753ef4/sklearn/feature_extraction/text.py#L587
The default of lowercase is True, but the stop words are lower case. Where
is the code that makes sure stop words are removed when they are not
in lower case?
Yes, ONNX is an appropriate solution when exporting models for prediction.
See http://scikit-learn.org/stable/modules/model_persistence.html
On Tue, 28 Jan 2020 at 23:03, Christopher.samiullah via scikit-learn <
scikit-learn@python.org> wrote:
> Dear admins,
>
>
> I recently encountered an issue attempting to load a model persisted via
> joblib dump on different Python architectures. I wrote up the issue here on
> stackoverflow:
>
> Are you concerned about storing the whole corpus text in memory, or the
> whole corpus' statistics? If the text, use input='file' or input='filename'
> (or a generator of texts).
I am not really sure which stage takes the most memory, as my program
is killed once it hits the memory limit. But I
Are you concerned about storing the whole corpus text in memory, or the
whole corpus' statistics? If the text, use input='file' or input='filename'
(or a generator of texts).
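A minimal sketch of the "generator of texts" option, using only the standard library. The function name `iter_texts` and the temporary files are hypothetical and exist only for illustration. A generator like this could be passed to the vectorizer's fit method in place of a list, so that only one document's text is held in memory at a time. The learned vocabulary and the resulting sparse matrix still have to fit in memory.

```python
import os
import tempfile

def iter_texts(paths, encoding="utf-8"):
    # Yield one document at a time instead of building a list of all texts.
    for path in paths:
        with open(path, encoding=encoding) as handle:
            yield handle.read()

# Create two small files so the sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
paths = []
for i, text in enumerate(["first document", "second document"]):
    path = os.path.join(tmp_dir, f"doc{i}.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    paths.append(path)

texts = iter_texts(paths)
first = next(texts)   # only this document has been read so far
print(first)
```

With input='filename', the list of paths itself would be passed and the vectorizer would open each file in turn.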
On Tue, 28 Jan 2020 at 18:01, Peng Yu wrote:
> Hi,
>
> To use TfidfVectorizer, the whole corpus must be used into