pig-user  

Re: Tuple and Datum implementations

Alan Gates
Thu, 25 Sep 2008 08:45:24 -0700



Pete Wyckoff wrote:
For #1.4, could I not implement a new storage implementation and when given
the file name, I choose the deserialization/serialization mechanism? This
would not allow me to hide the location of the file from the user, but would
still have the benefit of the storage implementation hiding the details of
the deserialization.

In the short term you could. Eventually we'd like to be able to decouple the loading from the metadata for those who want to use external metadata sources, so that they aren't forced to reimplement all the load and store functions.

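As a rough illustration of the short-term approach (one storage implementation that picks its deserializer from the file name), here is a minimal sketch. It does not use Pig's actual load/store interfaces: the class NameBasedStorage, the Deserializer interface, and the extension-to-format mapping are all hypothetical names chosen for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Pig's API: choose a deserializer by file name.
public class NameBasedStorage {

    /** Hypothetical stand-in for a format-specific deserializer. */
    interface Deserializer {
        List<String> deserialize(String line);
    }

    // Assumption for illustration: the file extension identifies the format.
    static Deserializer chooseFor(String fileName) {
        if (fileName.endsWith(".csv")) {
            // Comma-separated fields; limit -1 keeps trailing empty fields.
            return line -> Arrays.asList(line.split(",", -1));
        }
        if (fileName.endsWith(".tsv")) {
            // Tab-separated fields.
            return line -> Arrays.asList(line.split("\t", -1));
        }
        throw new IllegalArgumentException("No deserializer known for " + fileName);
    }

    public static void main(String[] args) {
        Deserializer d = chooseFor("events.tsv");
        System.out.println(d.deserialize("a\tb\tc")); // prints [a, b, c]
    }
}
```

The caller still has to supply the file name, so the location is not hidden, but the format details stay inside the storage implementation.
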
For #2, yes I see, I don't want to implement the full Bag API, just want to
construct a default data bag from a Set or a List native object.

As for Describe, I would mean on a symbolic name - presumably a name
returned by a "show" command I would also want to implement - both with
basically mysql semantics.

In my earlier mail I was thinking mainly of file level metadata (schema, etc.). Here you're proposing grid level metadata. We talked about being able to do things like show and describe on "tables" instead of on files, but haven't fleshed it out yet. I think we all agree it's something we'll want to be able to do.

Alan.