Vectorizing arbitrary value types with seq2sparse

Frank Scholten Fri, 06 May 2011 13:02:34 -0700

Hi everyone,

At the moment seq2sparse can only generate vectors from sequence file
values of type Text. More specifically, SequenceFileTokenizerMapper
only handles Text values.


Would it be useful if seq2sparse could be configured to vectorize
other value types, such as a blog article with several textual fields
(title, content, tags and so on)?

Or would it be easier to write a separate job for this, or to use Pig
or something similar?
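
To make the question concrete, here is a rough sketch of what the
"separate job" option might boil down to: flatten the fields of each
article into a single string, which could then be written out as a Text
value for seq2sparse to consume. BlogArticle and flatten are hypothetical
names made up for illustration; they are not part of Mahout, and the
Hadoop plumbing is left out on purpose.

```java
import java.util.Arrays;
import java.util.List;

public class BlogArticleFlattener {

    // Hypothetical value type with several textual fields (not a Mahout class).
    public static final class BlogArticle {
        final String title;
        final String content;
        final List<String> tags;

        public BlogArticle(String title, String content, List<String> tags) {
            this.title = title;
            this.content = content;
            this.tags = tags;
        }
    }

    // Joins title, content and tags into one space-separated string.
    // Assumption: the result would be wrapped in a Text value by a
    // preprocessing job before running seq2sparse.
    public static String flatten(BlogArticle article) {
        StringBuilder sb = new StringBuilder();
        sb.append(article.title).append(' ').append(article.content);
        for (String tag : article.tags) {
            sb.append(' ').append(tag);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        BlogArticle article = new BlogArticle(
                "Vectorizing blog posts",
                "Some body text about clustering.",
                Arrays.asList("mahout", "hadoop"));
        System.out.println(flatten(article));
    }
}
```

The obvious drawback is that this loses the field boundaries, so title
and tag terms cannot be weighted differently from body terms, which is
part of why I wonder whether seq2sparse itself should support this.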

Frank
