Pradeep Kamath updated PIG-901:

    Status: Patch Available  (was: Open)

PIG-901-trunk.patch is for the trunk. The change is in SliceWrapper to 
serialize ExecType only instead of PigContext since only the ExecType from the 
PigContext is used on deserialization. The package import list which Daniel 
referred to is a static member of PigContext which is explicitly set in 
SliceWrapper.makeRecordReader() and hence is taken care of.
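
To illustrate why serializing only the ExecType shrinks the split, here is a minimal, self-contained sketch. It is not the patch itself: Mode, Context, FatSplit and SlimSplit are hypothetical stand-ins for ExecType, PigContext and the before/after SliceWrapper, and it assumes plain Java serialization and a 64 KB payload to mimic a sizeable context.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SplitSizeSketch {
    // Hypothetical stand-in for Pig's ExecType.
    enum Mode { LOCAL, MAPREDUCE }

    // Hypothetical stand-in for a sizeable PigContext.
    static class Context implements Serializable {
        private static final long serialVersionUID = 1L;
        final Mode mode;
        final byte[] payload; // simulates properties, jar lists, etc.

        Context(Mode mode, int payloadBytes) {
            this.mode = mode;
            this.payload = new byte[payloadBytes];
        }
    }

    // "Before": the split drags the whole context along.
    static class FatSplit implements Serializable {
        private static final long serialVersionUID = 1L;
        final Context context;

        FatSplit(Context context) {
            this.context = context;
        }
    }

    // "After": the split keeps only the one field used on deserialization.
    static class SlimSplit implements Serializable {
        private static final long serialVersionUID = 1L;
        final Mode mode;

        SlimSplit(Mode mode) {
            this.mode = mode;
        }
    }

    // Serializes the object in memory and returns the number of bytes written.
    static int serializedSize(Serializable o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        Context ctx = new Context(Mode.MAPREDUCE, 64 * 1024);
        System.out.println("fat split bytes:  " + serializedSize(new FatSplit(ctx)));
        System.out.println("slim split bytes: " + serializedSize(new SlimSplit(ctx.mode)));
    }
}
```

In this sketch the fat split grows with the context payload, while the slim split stays at a small fixed size no matter how large the context becomes.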

It is a good suggestion to include a test case to check that even with a 
sizeable PigContext, we actually create small input splits. However, doing this 
in the current Pig code layout means opening up PigServer and 
JobControlCompiler so that we can compile a Pig script up to job creation and 
then, instead of submitting the job to Hadoop, instantiate PigInputFormat with 
the jobConf and get the input splits. This may require some design changes, 
which we should address at some point for these kinds of tests. For now there 
is a regression test in the patch to ensure the package import list is correctly 
handled, and we have manually tested to ensure the split size is small (order of 

> InputSplit (SliceWrapper) created by Pig is big in size due to serialized 
> PigContext
> ------------------------------------------------------------------------------------
>                 Key: PIG-901
>                 URL: https://issues.apache.org/jira/browse/PIG-901
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.3.1
>            Reporter: Pradeep Kamath
>            Assignee: Pradeep Kamath
>             Fix For: 0.4.0
>         Attachments: PIG-901-1.patch, PIG-901-branch-0.3.patch, 
> PIG-901-trunk.patch
> InputSplit (SliceWrapper) created by Pig is big in size due to serialized 
> PigContext. SliceWrapper only needs ExecType - so the entire PigContext 
> should not be serialized and only the ExecType should be serialized.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
