I would like to have a new attribute hashing_trace_ such as:
>>> vec = HashingVectorizer(tracing=True)
>>> vec.fit_transform([list_of_docs])
>>> vec.hashing_trace_
{4534: [('the', 344), ('rarercollidingtoken', 2)], 134:
[('someothertoken', 1)], }
`hashing_trace_` would be a dict of lists of t
Could you please elaborate more how to add this tracing mode?
i also realized my idea about keep the mapping from document tokens to non-zero
elements wouldn't work, since non-zero elements are not in the same order of
the original tokens.
use FeatureHasher to work on the individual word might w
2013/4/9 Terry Peng :
> Hi all,
>
> From HashingVectorizer's document, it said:
>
> - there is no way to compute the inverse transform (from feature indices
> to
> string feature names) which can be a problem when trying to introspect
> which features are most important to a model.
On 04/09/2013 09:48 AM, Terry Peng wrote:
Hi all,
what do you think?
If you want to store the dict, why not use the DictVectorizer?
--
Precog is a next-generation analytics platform capable of advanced
analytics on semi
Hi all,
From HashingVectorizer's document, it said:
- there is no way to compute the inverse transform (from feature indices to
string feature names) which can be a problem when trying to introspect
which features are most important to a model.
but i'm wondering if i can keep the