On 27/07/16 16:04, Jacob Wilder wrote:
> Hello everyone,
> I've tried a few things while working on this but would appreciate some
> suggestions on how to move forward. Or perhaps decide not to.
> 
> I've pushed a very limited version of this up here:
> https://github.com/jacobwil/incubator-pirk/compare/master...jacobwil:EnumForTypesInDataPartitioner
> 
> In this limited version we move knowledge of the size of each type (and
> appropriate number of partitions) into the enum making it easier to reuse
> that logic without reproducing entire getNumPartitions and getBits methods
> in each DataPartitioner implementation. I think this is worth keeping.

Cool.  It also gives some further typing rather than dealing with
strings everywhere.

> Basically, I made PrimitiveTypeEnum which holds each of the primitive
> types, can provide their bit size, and their number of partitions. Callers
> to PrimitiveTypePartitioner still provide types as strings which
> PrimitiveTypePartitioner converts into PrimitiveTypeEnums.
> 
> In my original attempt I took this further and started to change
> interface DataPartitioner into DataPartitioner<T extends TypeEnum> and
> PrimitiveTypePartitioner into PrimitiveTypePartitioner<PrimitiveTypeEnum>,
> etc.
> 
> I managed to push the string-to-enum conversion most of the way up to
> LoadDataSchemas. That's when I started to wrestle with just how much more
> complex this was becoming.

You're right that ideally the loader should be doing the string
conversion (i.e. move the safelyConvertTypeToEnum into the loader).
BTW if you do go there you should merge in the new schema and schema
loader code to avoid rework pain.

I like the idea of having the DataPartitioner<T extends TypeEnum> too,
but without trying it I'll take your word that it gets hairy.  If you
succeed you'd be able to change things like
  toPartitions(Object obj, String typeAsString)
to drop the string arg, right?

> While it's possible to have the instantiation of the DataPartitioner (as
> requested by the XML file) provide a factory method that converts
> string-represented-type information into the correct TypeEnum, I'd
> appreciate some feedback on if we consider this to truly be valuable.

I'm still trying to get my head around what it means for non-primitive
partitioners... in particular what the onus will be on people wanting to
add a new partitioner type.
 - implement a new TypeEnum
 - update the factory/whatever for the schema loader
 - anything else?

Regards,
Tim


Reply via email to