2012/6/15 Dinesh B Vadhia :
> The class CharNGramAnalyzer is documented at
> http://scikit-learn.org/0.8/modules/generated/scikits.learn.feature_extraction.text.CharNGramAnalyzer.html#scikits.learn.feature_extraction.text.CharNGramAnalyzer.
That's the 0.8 documentation. The latest release is 0.1
xt.py
Dinesh
Date: Fri, 15 Jun 2012 11:31:53 +0200
From: Olivier Grisel
Subject: Re: [Scikit-learn-general] Customizing the vectorizer classes
... for Asian Languages
To: [email protected]
Me
2012/6/15 xinfan meng :
> The docs tell you that you can customize and define a preprocessor to first
> segment the text if needed, e.g. in Chinese or Japanese. However, sklearn
> does not provide one such preprocessor. To see how you can implement one,
> the best way is to take a look at the code.
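As a rough illustration of the customization discussed above (not code from
scikit-learn itself), here is a minimal pure-Python sketch of a character
n-gram analyzer for text without word separators. The function name
char_ngrams, its parameters min_n and max_n, and the sample documents are all
hypothetical. Recent scikit-learn releases accept a callable for the
analyzer parameter of CountVectorizer, so a function shaped like this could
presumably be passed there; check the documentation for your version.

```python
from collections import Counter


def char_ngrams(text, min_n=1, max_n=2):
    """Return overlapping character n-grams of `text` (hypothetical helper).

    Whitespace is removed first, so the n-grams span only visible
    characters. This stands in for real word segmentation, which
    Chinese or Japanese text would normally need.
    """
    text = "".join(text.split())  # drop all whitespace
    grams = []
    for n in range(min_n, max_n + 1):
        for i in range(len(text) - n + 1):
            grams.append(text[i:i + n])
    return grams


# Made-up sample documents, for illustration only.
docs = ["我爱北京", "北京欢迎你"]

# One bag-of-n-grams count per document.
counts = [Counter(char_ngrams(d)) for d in docs]
```

Character n-grams avoid the need for a segmenter at the cost of a larger
vocabulary; a real segmenter could be swapped in by replacing the body of
the function with a call that returns a list of tokens.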
The docs tell you that you can customize and define a preprocessor to first
segment the text if needed, e.g. in Chinese or Japanese. However, sklearn
does not provide one such preprocessor. To see how you can implement one,
the best way is to take a look at the code. I think the text processing
pip