Hi,

I am using python-2.7.3, numpy-1.6.2-win32-superpack-python2.7,
scipy-0.11.0rc1-win32-superpack-python2.7 and
scikit-learn-0.11.win32-py2.7.

I tried the following:

>>> train_set = ("The sky is blue.", "The sun is bright.")
>>> test_set = ("The sun in the sky is bright.",
...             "We can see the shining sun, the bright sun.")
>>> from sklearn.feature_extraction.text import CountVectorizer
>>> vectorizer = CountVectorizer()
>>> print vectorizer
CountVectorizer(analyzer=word, binary=False, charset=utf-8,
        charset_error=strict, dtype=<type 'long'>, input=content,
        lowercase=True, max_df=1.0, max_features=None, max_n=1, min_n=1,
        preprocessor=None, stop_words=None, strip_accents=None,
        token_pattern=\b\w\w+\b, tokenizer=None, vocabulary=None)
>>> vectorizer.fit_transform(train_set)
<2x6 sparse matrix of type '<type 'numpy.int64'>'
        with 8 stored elements in COOrdinate format>
>>> print vectorizer.vocabulary
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    print vectorizer.vocabulary
AttributeError: 'CountVectorizer' object has no attribute 'vocabulary'

I tried setting the CountVectorizer parameters explicitly
(analyzer = WordNGramAnalyzer, vocabulary = dict), but that didn't work.
I then installed sklearn 0.9, where the same code works, so my immediate
problem is solved. I would still like to know what is wrong with
sklearn 0.11.

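
A possible workaround, sketched under an assumption that the session above does not confirm: that newer releases store the fitted term-to-index mapping under a trailing-underscore name, `vocabulary_`, instead of `vocabulary`. The helper `get_fitted_vocabulary` and the stand-in class `_FakeVectorizer` are hypothetical names used only for illustration, and the index values are made up. The stand-in is there so the snippet runs without scikit-learn installed.

```python
def get_fitted_vocabulary(vectorizer):
    """Return the fitted term -> column index mapping, whichever name it uses."""
    # Assumption: newer releases expose the fitted mapping as `vocabulary_`,
    # while older ones (such as 0.9) expose it as `vocabulary`.
    for name in ("vocabulary_", "vocabulary"):
        vocab = getattr(vectorizer, name, None)
        if vocab is not None:
            return vocab
    raise AttributeError("no fitted vocabulary found; was fit() called?")


class _FakeVectorizer(object):
    """Hypothetical stand-in for a fitted vectorizer (not a sklearn class)."""

    def __init__(self):
        # Illustrative mapping for the six distinct tokens in train_set;
        # the real column indices may differ.
        self.vocabulary_ = {"blue": 0, "bright": 1, "is": 2,
                            "sky": 3, "sun": 4, "the": 5}


vocab = get_fitted_vocabulary(_FakeVectorizer())
print(sorted(vocab))
```

With a real fitted `CountVectorizer`, the same call would be `get_fitted_vocabulary(vectorizer)` after `fit_transform(train_set)`. If neither attribute exists, the helper raises `AttributeError`, which would rule this explanation out.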
Andrés Soto
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general