On Wed, Oct 16, 2013 at 12:15 AM, Gabriel Reid <[email protected]> wrote:

> On Wed, Oct 16, 2013 at 8:46 AM, Josh Wills <[email protected]> wrote:
>
> > On Tue, Oct 15, 2013 at 11:42 PM, Chao Shi <[email protected]> wrote:
> >
> > > I don't understand why needs another PTypeFamily here. I think we can
> > > simply provide some pre-defined PTypes.
> > >
> > > interface HFilePTypes {
> > >   static KEY_VALUE_PTYPE = xxx
> > >   static PUT_PTYPE = xxx
> > > }
> > >
> >
> > Technically, every PType has to provide an implementation of the
> > PTypeFamily getFamily() method-- even if it's just returning a dummy
> > object.
> >
>
> Wouldn't a derived PType (like in o.a.c.types.PTypes) be a better fit here?
>

That was my initial attempt, and in an ideal world, my preferred solution--
but I haven't figured out how to make it work. The question here is: what
do I derive a KeyValue object to? What I really want, for purposes of
reading it/writing it to one of our HBase IO formats, is to map it to
itself, and not some subclass of Writable. Another option might be an
extension of WritableType to handle these special case formats-- I'll take
a crack at getting that to work.

> A whole new PTypeFamily sounds like a lot of work (unless maybe if it was a
> subclass of one of the existing ones), and I think there's still a fair bit
> of code that assumes that Avro & Writable are the only two possible
> PTypeFamily implementations.
>

For any kind of intermediate processing, that is still true. The
HBaseTypeFamily would only ever really appear at the input or output for a
job.

>
> - Gabriel
>

--
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
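
For anyone following the thread, here is a rough, self-contained sketch of the "derived type" pattern being discussed: a derived type is defined by a pair of mapping functions onto some base type that already knows how to serialize itself. Everything in it (SimpleType, derived, BYTES, Cell) is a hypothetical stand-in invented for illustration; none of it is the Crunch or HBase API. It only shows why a derived type always needs a base type to map to, which is the sticking point when what you want is for KeyValue to map to itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

public class DerivedTypeSketch {

    /** Hypothetical stand-in for a serializable type description (not Crunch's PType). */
    interface SimpleType<T> {
        byte[] write(T value);
        T read(byte[] bytes);
    }

    /** A base type: raw bytes are written and read as themselves. */
    static final SimpleType<byte[]> BYTES = new SimpleType<byte[]>() {
        public byte[] write(byte[] value) { return value.clone(); }
        public byte[] read(byte[] bytes) { return bytes.clone(); }
    };

    /** A derived type converts T to the base type S on write and back again on read. */
    static <S, T> SimpleType<T> derived(Function<S, T> in, Function<T, S> out, SimpleType<S> base) {
        return new SimpleType<T>() {
            public byte[] write(T value) { return base.write(out.apply(value)); }
            public T read(byte[] bytes) { return in.apply(base.read(bytes)); }
        };
    }

    /** Toy cell with a row and a value; a hypothetical stand-in, not HBase's KeyValue. */
    static final class Cell {
        final String row;
        final String value;

        Cell(String row, String value) {
            this.row = row;
            this.value = value;
        }

        /** Encodes as "row<TAB>value"; assumes the row contains no tab character. */
        byte[] toBytes() {
            return (row + "\t" + value).getBytes(StandardCharsets.UTF_8);
        }

        static Cell fromBytes(byte[] bytes) {
            String s = new String(bytes, StandardCharsets.UTF_8);
            int tab = s.indexOf('\t');
            return new Cell(s.substring(0, tab), s.substring(tab + 1));
        }
    }

    /** The derived type for Cell has to name a base type; here that is BYTES. */
    static final SimpleType<Cell> CELL = derived(Cell::fromBytes, Cell::toBytes, BYTES);

    public static void main(String[] args) {
        // Round-trip a cell through its derived type.
        Cell copy = CELL.read(CELL.write(new Cell("row1", "v1")));
        System.out.println(copy.row + "=" + copy.value);
    }
}
```

In this toy version the base type is plain bytes, so the indirection is harmless. The difficulty described above is that the available base types in the Writable family are Writable subclasses, while the HBase IO formats want the KeyValue object itself.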
