checkSchema should be called whenever the schema of the relation being stored is known. Typically we save the schema in UDFContext inside checkSchema and read it back later when it is needed. See OrcStorage for an example: http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/builtin/OrcStorage.java?view=markup
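
For your name-to-index map, here is a rough sketch of that save-and-read-back pattern in plain Java, so it runs without Pig on the classpath. The class, method names, and the ".fieldNames" key suffix are all made up for illustration. The Properties object stands in for what UDFContext.getUDFContext().getUDFProperties(getClass(), new String[] {signature}) would give you, the signature is the string Pig hands to setStoreFuncUDFContextSignature, and the field names would come from the ResourceSchema passed to checkSchema (for example via fieldNames()). It assumes every field has a name and that names contain no commas.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of Pig: shows the save/read-back pattern only.
final class SchemaFieldIndex {
    // Hypothetical key suffix; any stable, unique suffix would do.
    static final String SCHEMA_KEY_SUFFIX = ".fieldNames";

    // What checkSchema would do: record the field names under a key
    // derived from the store function's signature.
    static void saveFieldNames(Properties props, String signature, String[] fieldNames) {
        props.setProperty(signature + SCHEMA_KEY_SUFFIX, String.join(",", fieldNames));
    }

    // What the store function would do later (e.g. before writing tuples):
    // read the names back and build the name -> position map.
    static Map<String, Integer> loadFieldIndex(Properties props, String signature) {
        String joined = props.getProperty(signature + SCHEMA_KEY_SUFFIX);
        if (joined == null) {
            throw new IllegalStateException("No schema stored; checkSchema was not called");
        }
        Map<String, Integer> index = new LinkedHashMap<>();
        String[] names = joined.split(",");
        for (int i = 0; i < names.length; i++) {
            index.put(names[i], i);
        }
        return index;
    }

    public static void main(String[] args) {
        Properties props = new Properties(); // stand-in for the UDFContext properties
        saveFieldNames(props, "store1", new String[] {"year", "region", "sales"});
        Map<String, Integer> index = loadFieldIndex(props, "store1");
        System.out.println(index.get("region")); // prints 1
    }
}
```

Throwing when the property is missing makes the "checkSchema was not called" case fail loudly instead of silently falling back to wrong indexes.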
On Wed, Aug 6, 2014 at 11:19 AM, Rodrick <[email protected]> wrote:
> Hi,
>
> I would like to create a StoreFunc like MultiStorage that, instead of
> referencing the fields to be added to the output path by index, references
> them by name (it would construct a map between names and indexes based on
> the schema of the data to be output). Is there a mechanism for a StoreFunc
> to access the schema of the data being stored? I thought overriding
> checkSchema would do this, but it does not seem to be called in all cases.
>
> Thank you,
> Rodrick
