Hi all,
The HashingVectorizer documentation says:
- there is no way to compute the inverse transform (from feature indices to
string feature names) which can be a problem when trying to introspect
which features are most important to a model.
But I'm wondering whether I can keep the mapping somewhere else and use it to
do the inverse transform. For example, I could get the indices from
hashingvectorizer.transform([text]).nonzero() and then recover the words from
the text, or pass a dictionary to hashingvectorizer.transform to make sure the
words and indices stay in a consistent order.
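A minimal sketch of the first idea, assuming the default word-level analyzer
(so each token, passed back in as its own one-word document, tokenizes to
itself). The helper name index_to_tokens and the sample text are made up for
illustration; this is not part of the scikit-learn API.

```python
from collections import defaultdict

from sklearn.feature_extraction.text import HashingVectorizer

# Default settings: word unigrams, lowercase, alternate_sign=True.
vectorizer = HashingVectorizer(n_features=2**20)
analyze = vectorizer.build_analyzer()


def index_to_tokens(text):
    """Hypothetical helper: map hashed column index -> tokens of `text` landing there."""
    tokens = sorted(set(analyze(text)))
    mapping = defaultdict(set)
    if not tokens:
        return mapping
    # Treat every token as a one-word document: row i then has a single
    # nonzero entry, in the column that tokens[i] hashes to.
    hashed = vectorizer.transform(tokens)
    rows, cols = hashed.nonzero()
    for row, col in zip(rows, cols):
        mapping[int(col)].add(tokens[row])
    return mapping


text = "the quick brown fox jumps over the lazy dog"
mapping = index_to_tokens(text)

# Look up the words behind the nonzero columns of the document vector.
doc_cols = vectorizer.transform([text]).nonzero()[1]
recovered = {int(col): sorted(mapping[int(col)]) for col in doc_cols}
```

Any index whose set holds more than one token is a collision inside that
document. The mapping only covers tokens seen in the texts passed in, so it
would have to be built over every document you want to introspect.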
One problem is that there can be collisions, so different words can map to
the same index, but I think that is quite rare, especially if I only want the
most important feature from a single document.
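A rough way to check the "quite rare" intuition is the birthday-problem
estimate below. It assumes the hash spreads tokens uniformly over the
n_features buckets, and the figure of 200 distinct words per document is just
an illustrative guess; 2**20 is the HashingVectorizer default for n_features.

```python
import math


def collision_probability(n_tokens, n_buckets):
    """Probability that at least two of n_tokens share a bucket (uniform hashing assumed)."""
    log_p_no_collision = sum(
        math.log1p(-i / n_buckets) for i in range(n_tokens)
    )
    return 1.0 - math.exp(log_p_no_collision)


def expected_colliding_pairs(n_tokens, n_buckets):
    """Expected number of token pairs that land in the same bucket."""
    return n_tokens * (n_tokens - 1) / (2 * n_buckets)


n_tokens = 200        # illustrative: distinct words in one document
n_buckets = 2**20     # HashingVectorizer default n_features

p_any = collision_probability(n_tokens, n_buckets)
pairs = expected_colliding_pairs(n_tokens, n_buckets)
print(f"P(at least one collision) = {p_any:.4f}")
print(f"expected colliding pairs  = {pairs:.4f}")
```

Under these assumptions a single document rarely contains a collision, but
the estimate grows roughly with the square of the number of distinct tokens,
so a mapping built over a whole corpus vocabulary will see many more.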
what do you think?
Best Regards,
--Terry
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general