Hi,

I am using python-2.7.3, numpy-1.6.2-win32-superpack-python2.7,
scipy-0.11.0rc1-win32-superpack-python2.7 and
scikit-learn-0.11.win32-py2.7.

I tried the following:
 
>>> train_set = ("The sky is blue.", "The sun is bright.")
>>> test_set = ("The sun in the sky is bright.",
...             "We can see the shining sun, the bright sun.")
>>> from sklearn.feature_extraction.text import CountVectorizer
>>> vectorizer = CountVectorizer()
>>> print vectorizer
CountVectorizer(analyzer=word, binary=False, charset=utf-8,
        charset_error=strict, dtype=<type 'long'>, input=content,
        lowercase=True, max_df=1.0, max_features=None, max_n=1, min_n=1,
        preprocessor=None, stop_words=None, strip_accents=None,
        token_pattern=\b\w\w+\b, tokenizer=None, vocabulary=None)
>>> vectorizer.fit_transform(train_set)
<2x6 sparse matrix of type '<type 'numpy.int64'>'
        with 8 stored elements in COOrdinate format>
>>> print vectorizer.vocabulary
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    print vectorizer.vocabulary
AttributeError: 'CountVectorizer' object has no attribute 'vocabulary'
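For reference, this is roughly what the vocabulary in question is expected to hold. The following is only a minimal pure-Python sketch, assuming the default settings printed in the session (lowercasing and the token pattern \b\w\w+\b). It is not scikit-learn's implementation: the helper names tokenize, build_vocabulary and count_matrix are hypothetical, and the alphabetical column order is an assumption.

```python
import re

# Same token pattern as shown in the CountVectorizer repr: words of 2+ characters.
TOKEN_PATTERN = re.compile(r"\b\w\w+\b")

def tokenize(doc):
    # Lowercase first, mirroring lowercase=True in the printed parameters.
    return TOKEN_PATTERN.findall(doc.lower())

def build_vocabulary(docs):
    # Map each distinct term to a column index (alphabetical order is an assumption).
    terms = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {term: idx for idx, term in enumerate(terms)}

def count_matrix(docs, vocabulary):
    # One dense row of term counts per document; unknown terms are ignored.
    rows = []
    for doc in docs:
        row = [0] * len(vocabulary)
        for tok in tokenize(doc):
            if tok in vocabulary:
                row[vocabulary[tok]] += 1
        rows.append(row)
    return rows

train_set = ("The sky is blue.", "The sun is bright.")
vocabulary = build_vocabulary(train_set)
counts = count_matrix(train_set, vocabulary)
nonzero = sum(1 for row in counts for c in row if c)
print("%d x %d matrix, %d nonzero counts" % (len(counts), len(vocabulary), nonzero))
```

For train_set this gives six distinct terms and eight nonzero counts, which agrees with the 2x6 sparse matrix with 8 stored elements reported by fit_transform.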
 
I tried setting the CountVectorizer parameters explicitly
(analyzer = WordNGramAnalyzer, vocabulary = dict), but that didn’t work
either. I then installed sklearn 0.9, where the same code works, so the
immediate problem is solved, but I would still like to know what is wrong
with sklearn 0.11.
Andrés Soto
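
One possible explanation, stated as an assumption and not confirmed anywhere in this message: scikit-learn follows a convention of storing fitted state in attributes with a trailing underscore, so in 0.11 the learned mapping may live in vocabulary_ while 0.9 still exposed it as vocabulary. A minimal sketch of a lookup that tolerates both names (fitted_vocabulary is a hypothetical helper, not part of scikit-learn):

```python
def fitted_vocabulary(vectorizer):
    """Return the learned term -> column index mapping, whichever name holds it."""
    # Assumption: newer releases use `vocabulary_`, older ones use `vocabulary`.
    for name in ("vocabulary_", "vocabulary"):
        vocab = getattr(vectorizer, name, None)
        if vocab:  # skip attributes that are missing, None, or empty
            return vocab
    raise AttributeError("no fitted vocabulary found on %r" % (vectorizer,))
```

In the session above, calling fitted_vocabulary(vectorizer) after fit_transform would take the place of vectorizer.vocabulary. If the helper still raises AttributeError, the assumption about the attribute name does not hold for that release.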
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
