If the arbitrary objects you refer to fit nicely into Pig's notion of tuples/bags/maps/primitives, then you can use those types directly.

Otherwise, because Pig schemas have limited support for complex/arbitrary objects (there is no support for something like Writable, for example), you will most probably need to treat the objects as bytearray (assuming they are serializable) and convert to/from byte[] as part of their use. Pig currently does not allow you to decouple an object from its serialization.
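
To illustrate the to/from byte[] conversion, here is a minimal sketch that uses only the JDK. It assumes the object implements java.io.Serializable; an object that is only a Hadoop Writable would need a round trip through its own write/readFields methods instead. The class name ObjectBytes and its two methods are hypothetical helpers, not part of Pig.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical helper (not a Pig class): converts a Serializable object
// to a byte[] and back using standard Java serialization.
final class ObjectBytes {
    private ObjectBytes() {}

    // Serialize the object into an in-memory buffer and return its bytes.
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);
        }
        // The stream is closed (and therefore flushed) before the copy is taken.
        return buf.toByteArray();
    }

    // Rebuild the object from bytes produced by toBytes.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for an arbitrary serializable object.
        ArrayList<String> pattern = new ArrayList<>(Arrays.asList("milk", "bread"));
        byte[] raw = toBytes(pattern);
        Object restored = fromBytes(raw);
        System.out.println(pattern.equals(restored));
    }
}
```

Inside a load function or UDF, the resulting byte[] would typically be wrapped in org.apache.pig.data.DataByteArray so that Pig sees it as a bytearray field, and unwrapped again with get() before deserializing.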


Regards,
Mridul

On Thursday 07 April 2011 07:00 AM, Mark wrote:
If I wanted to load arbitrary objects into some tuples, what classes
should I be looking at? Would I need some sort of storage class?

For example, I have a data file that contains
org.apache.mahout.fpm.pfpgrowth.convertors.string.TopKStringPatterns. I
would like to iterate over them in Pig using something like:

rows = LOAD 'data' using TopKStringPatternsStorage();

Is this correct? Is there any wiki on creating storages? Is there
anything I should look out for?

Thanks for the pointers
