On Tue, Jul 7, 2009 at 18:57, stack <[email protected]> wrote:
> 2009/7/7 Doğacan Güney <[email protected]>
>
> > Hi list,
> >
> > In current trunk, TableReducer is defined like this:
> >
> > ....
> > public abstract class TableReducer<KEYIN, VALUEIN>
> >   extends Reducer<KEYIN, VALUEIN, ImmutableBytesWritable, Put>
> > ....
> >
> > As VALUEOUT is a Put, I guess one cannot delete columns (as we could
> > with BatchUpdate) using collect(). I can still create Delete-s in
> > #reduce and do a table.delete, but that seems unintuitive to me. Am I
> > missing something here, or is this the intended behavior?
>
> That's intended behavior for that class. Put and Delete do not share a
> common ancestor other than Writable, so it's a little awkward.
>
> What would you suggest, Doğacan? Maybe we should add Marker interfaces to
> Put and Delete and then change TableReducer to take the Marker?

Sure, that's a good idea.
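Something like this, maybe? Just a sketch -- "Row" is only a placeholder
name, and I haven't thought hard about what to call it or where it should
live:

    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.mapreduce.Reducer;

    // The marker declares no methods of its own. Put and Delete already
    // implement Writable, so each would just add "implements Row".
    public interface Row extends Writable {}

    // TableReducer can then emit either mutation type as VALUEOUT.
    public abstract class TableReducer<KEYIN, VALUEIN>
        extends Reducer<KEYIN, VALUEIN, ImmutableBytesWritable, Row> {
    }

A reducer could then context.write() a Put or a Delete through the same
signature, and TableOutputFormat would dispatch on the concrete type.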
I haven't studied hadoop 0.20's API much yet, so I am not sure whether this
is possible, but could hbase have its own ReduceContext class? If so, maybe
we could just expose the HTable instance through the context and let the
user do whatever they want on the table (and throw an exception if
context.write() is called). I think this would be much simpler to
understand than the write()/collect() calls (e.g. TableOutputFormat ignores
the collected keys). Does this make sense?
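Roughly, I am imagining something like this -- TableReduceContext is a
made-up name, and I am hand-waving over how it would hook into Hadoop
0.20's Reducer.Context machinery:

    import org.apache.hadoop.hbase.client.HTable;

    // Sketch only: a context that hands the reducer the table itself
    // instead of collecting (key, value) pairs.
    public class TableReduceContext {
      private final HTable table;

      public TableReduceContext(HTable table) {
        this.table = table;
      }

      // The reducer works against the table directly.
      public HTable getTable() {
        return table;
      }

      // write() has no meaning here, so fail loudly instead of silently
      // dropping the key the way TableOutputFormat does today.
      public void write(Object key, Object value) {
        throw new UnsupportedOperationException(
            "use getTable() and write to the table directly");
      }
    }

That way, in #reduce, a delete is as easy as a put:
context.getTable().delete(new Delete(row)).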
> Now is a good time to bring this up before it gets set in stone by the
> 0.20.0 release.
>
> Thanks for looking at this.

No problem :) Hbase 0.20 is shaping up to be really awesome, btw :)

> St.Ack

--
Doğacan Güney