Make job configuration object properly serialized to backend in Pig's LoadFunc
------------------------------------------------------------------------------

                 Key: PIG-1337
                 URL: https://issues.apache.org/jira/browse/PIG-1337
             Project: Pig
          Issue Type: Improvement
    Affects Versions: 0.6.0
            Reporter: Chao Wang
             Fix For: 0.8.0


The Zebra storage layer needs to use the distributed cache to reduce name node
load during job runs.

To do this, Zebra needs to set up distributed cache related configuration
information in TableLoader (which extends Pig's LoadFunc).
It currently does this within getSchema(conf). The problem is that the conf
object passed there is not the one that is serialized to the map/reduce
backend. As a result, the distributed cache is not set up properly.

To work around this problem, Pig's LoadFunc needs to provide a way for us to
set up distributed cache information in a conf object, and that conf object
must be the one used by the map/reduce backend.
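
The mismatch can be illustrated with a minimal, self-contained Java sketch.
It does not use the Pig, Hadoop, or Zebra APIs: a plain Map stands in for the
job configuration, and every name in it (ConfPropagationSketch, CACHE_KEY,
setUpCache, submitWithCopiedConf, submitWithSharedConf) is hypothetical and
chosen only for illustration. It assumes the loader hook is handed a private
copy of the conf in the problem case, which is one way the settings could get
lost; the actual cause inside Pig may differ.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfPropagationSketch {

    // Hypothetical key, for illustration only; not a real Hadoop or Zebra property.
    static final String CACHE_KEY = "example.cache.files";

    // Stand-in for a loader hook such as getSchema(conf): it records the
    // cache settings in whichever conf object it is handed.
    static void setUpCache(Map<String, String> conf) {
        conf.put(CACHE_KEY, "/example/path/index");
    }

    // Problem case (assumed): the hook receives a private copy, so the conf
    // that is shipped to the backend never sees the cache settings.
    static Map<String, String> submitWithCopiedConf(Map<String, String> jobConf) {
        Map<String, String> copy = new HashMap<>(jobConf);
        setUpCache(copy);
        return new HashMap<>(jobConf); // snapshot of what the backend would receive
    }

    // Requested behavior: the hook receives the job conf itself, so the
    // settings are present in what the backend receives.
    static Map<String, String> submitWithSharedConf(Map<String, String> jobConf) {
        setUpCache(jobConf);
        return new HashMap<>(jobConf); // snapshot of what the backend would receive
    }

    public static void main(String[] args) {
        Map<String, String> lost = submitWithCopiedConf(new HashMap<>());
        Map<String, String> kept = submitWithSharedConf(new HashMap<>());
        System.out.println("copied conf carries cache key: " + lost.containsKey(CACHE_KEY));
        System.out.println("shared conf carries cache key: " + kept.containsKey(CACHE_KEY));
    }
}
```

Running main prints false for the copied conf and true for the shared conf,
which is the difference this issue asks Pig to close.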



