[Scikit-learn-general] inverse transform in HashingVectorizer

Terry Peng Tue, 09 Apr 2013 00:49:07 -0700

Hi all,

The HashingVectorizer documentation says:


    - there is no way to compute the inverse transform (from feature indices to
      string feature names) which can be a problem when trying to introspect
      which features are most important to a model.

But I'm wondering whether I can keep the mapping somewhere else and do the
inverse transform myself. For example, I could get the indices from
hashingvectorizer.transform([text]).nonzero() and then look up the words in
the text, or pass a dictionary to hashingvectorizer.transform to make sure
the words and indices are in a consistent order.
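
A minimal sketch of the side-mapping idea, not an existing scikit-learn
feature. The helper name build_index_to_tokens is hypothetical. It assumes
that hashing each token on its own with FeatureHasher (input_type="string",
same n_features) yields the same column index that HashingVectorizer assigns
to that token, which is how I understand the vectorizer to work internally.

```python
from collections import defaultdict

from sklearn.feature_extraction import FeatureHasher
from sklearn.feature_extraction.text import HashingVectorizer


def build_index_to_tokens(vectorizer, texts):
    """Map each hashed column index to the set of tokens seen at that index.

    Hypothetical helper: the mapping only covers tokens that occur in `texts`.
    """
    analyzer = vectorizer.build_analyzer()
    # Assumption: HashingVectorizer hashes analyzed tokens the same way as a
    # FeatureHasher with input_type="string" and the same n_features.
    hasher = FeatureHasher(n_features=vectorizer.n_features, input_type="string")

    vocab = sorted({tok for text in texts for tok in analyzer(text)})
    mapping = defaultdict(set)
    if not vocab:
        return mapping

    # One single-token "document" per row, so each row has one nonzero column.
    hashed = hasher.transform([[tok] for tok in vocab])
    rows, cols = hashed.nonzero()
    for row, col in zip(rows, cols):
        mapping[int(col)].add(vocab[row])
    return mapping


vectorizer = HashingVectorizer(n_features=2 ** 18)
text = "the quick brown fox jumps over the lazy dog"
index_to_tokens = build_index_to_tokens(vectorizer, [text])

# Look up the candidate tokens behind every nonzero column of the document.
for col in vectorizer.transform([text]).nonzero()[1]:
    print(col, sorted(index_to_tokens[int(col)]))
```

An index that maps to more than one token is a collision, so the lookup
returns a set of candidates rather than a single word.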

One problem with this is collisions: different words can map to the same
index. But I think that's quite rare, especially if I only want to get the
most important feature from a single document.
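
To get a rough feel for how rare collisions are, here is a back-of-the-envelope
sketch. It assumes the hash spreads tokens uniformly and independently over
the n_features columns, which is an idealization. Both function names are
hypothetical. The first gives the chance that any two of k distinct tokens
share an index; the second gives the chance that one particular token (say,
the most important feature) shares its index with any of the other k - 1.

```python
def collision_probability(k, n_features):
    """P(at least two of k distinct tokens share an index), uniform hashing assumed."""
    if k > n_features:
        return 1.0  # more tokens than columns: a collision is certain
    p_no_collision = 1.0
    for i in range(k):
        # The (i+1)-th token must avoid the i columns already taken.
        p_no_collision *= (n_features - i) / n_features
    return 1.0 - p_no_collision


def hit_probability(k, n_features):
    """P(one given token shares its index with any of the other k - 1 tokens)."""
    return 1.0 - (1.0 - 1.0 / n_features) ** (k - 1)


# Example: a document with 500 distinct tokens and 2**20 columns
# (2**20 is the HashingVectorizer default for n_features).
print(collision_probability(500, 2 ** 20))
print(hit_probability(500, 2 ** 20))
```

The second number is the relevant one when only a single feature is being
looked up; the first grows much faster as the number of distinct tokens
increases, for example when the mapping covers a whole corpus.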

What do you think?

Best Regards,
--Terry


