Hi Ted,

I have created a JIRA item for this, but I cannot assign the task to myself. Do I have permission to work on it and submit a patch?

Thank you,
Jeff Zhang

On Tue, Nov 17, 2009 at 1:35 AM, Ted Dunning <[email protected]> wrote:

> Jeff,
>
> Glad to hear you are looking at Mahout.
>
> Practically speaking, it probably isn't feasible to have an hbase column per
> matrix column. That makes storage of matrix data in hbase somewhat less
> compelling, although clearly still very useful for some applications.
>
> As Grant pointed out, Mahout is trying to stay pretty agnostic relative to
> data storage methods. Some people need to read matrices from Lucene
> indexes, others from files, still others from hbase. We need to support all
> of those options.
>
> Your suggestion about making sure that Taste supports hbase is a good one.
>
> On Mon, Nov 16, 2009 at 12:54 AM, Jeff Zhang <[email protected]> wrote:
>
> > Then we can store them as one hbase row:
> > A: {title:love=>1,
> > content:I=>1,content:love=>1,content:this=>1,content:game=>1}
> >
> > Using hbase, it will be very easy for us to compute the similarity between
> > documents.
> > And another advantage of hbase compared to raw text data is that it's
> > semi-structured. And I think it will be easy for programming if we use hbase
> > rather than the raw data.
>
> --
> Ted Dunning, CTO
> DeepDyve
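
To make the row layout quoted in this thread concrete, here is a rough sketch of how document similarity could be computed once each document is stored as one row of "family:qualifier => count" cells. It is only an illustration under stated assumptions: plain Python dicts stand in for hbase rows, no hbase or Mahout API is called, cosine similarity is an assumed choice of measure (the thread does not name one), and doc_b is a made-up second document.

```python
import math

# Each document is one "row": a mapping from "family:qualifier" to a term count.
# doc_a mirrors row A from the thread; doc_b is a hypothetical second document.
doc_a = {"title:love": 1, "content:I": 1, "content:love": 1,
         "content:this": 1, "content:game": 1}
doc_b = {"title:game": 1, "content:I": 1, "content:love": 1,
         "content:chess": 1}


def cosine_similarity(row_x, row_y):
    """Cosine similarity between two sparse rows stored as dicts."""
    # Only cells present in both rows contribute to the dot product.
    shared = set(row_x) & set(row_y)
    dot = sum(row_x[k] * row_y[k] for k in shared)
    norm_x = math.sqrt(sum(v * v for v in row_x.values()))
    norm_y = math.sqrt(sum(v * v for v in row_y.values()))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # an empty row is treated as similar to nothing
    return dot / (norm_x * norm_y)


if __name__ == "__main__":
    print(cosine_similarity(doc_a, doc_b))
```

Because each row lists only the terms that actually occur, the computation touches only the non-zero cells, which is the property that makes a sparse, per-term column layout attractive here.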
