Hey all,

To kill some time this afternoon, I took a pass at figuring out what changes Crunch would need in order to support HBase 0.96, which is going through a few release candidates right now. I started by building against the 0.95.2 release, which has most of the API changes that I'm told we can expect in 0.96.

The most consequential change I found is that many of the core HBase classes we operate on (Put, Delete, KeyValue, and Result) will no longer implement the Writable interface. Instead, the HBase team has added a number of SerializationFactory classes for these types, which map the POJO versions of those objects onto protocol buffers. This means that the current trick of creating PTypes for HBase like this:

    PType<Result> ptype = Writables.writables(Result.class);

won't work anymore in 0.96. In other words, the HBase data classes won't fit into either of the existing type families (Writables or Avros).

The best solution I've come up with so far is to create a new, HBase-specific PTypeFamily that supports the way these classes are serialized now. I'm not sure whether there's a better approach, or how complex this particular PTypeFamily implementation would need to be, so I'm very much open to ideas on how to proceed.

J

--
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
