[ https://issues.apache.org/jira/browse/PIG-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600838#comment-14600838 ]

Rohini Palaniswamy commented on PIG-4443:
-----------------------------------------

Ok. I get the problem. The split sizes of the 18 loaders add up and cross 
64 MB. Since this patch only checks that a single loader's split size does not 
exceed the threshold, it does not fix the issue. I am not sure how it could 
have worked with Pig 0.14. The only thing I can think of is that the Hadoop 
version changed between when Pig 0.14 was installed and when Pig 0.15 was 
installed, since this RPC bounds check was only added in Hadoop 2.6 
(HADOOP-10940). Can you check if that is the case?
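
For context, the check that fails is the one HADOOP-10940 added to the Hadoop 
IPC server. A minimal sketch of its shape (a simplified standalone form, not 
the actual server code; the property name and the 64 MB default are real):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;

    public class RpcLengthCheck {
        // Minimal sketch of the bounds check HADOOP-10940 added in Hadoop
        // 2.6 (simplified; the real check lives in the IPC Server class).
        // ipc.maximum.data.length defaults to 64 MB = 67108864 bytes, which
        // is the limit shown in the exception quoted below.
        static void checkDataLength(Configuration conf, int requestLength)
                throws IOException {
            int maxDataLength =
                conf.getInt("ipc.maximum.data.length", 64 * 1024 * 1024);
            if (requestLength > maxDataLength) {
                throw new IOException("Requested data length " + requestLength
                    + " is longer than maximum configured RPC length "
                    + maxDataLength);
            }
        }
    }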

If you had HIVE-9845, you would not be hitting this issue. From the Pig side, 
you can try running with pig -Dpig.compress.input.splits=true, and that should 
mostly fix it. If that still does not work, you can also try pig 
-Dipc.maximum.data.length=268435456 (256 MB).
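
For example (myscript.pig is just a placeholder for your script; 
268435456 bytes = 256 * 1024 * 1024):

    # First, try compressing the serialized input splits:
    pig -Dpig.compress.input.splits=true myscript.pig

    # If the payload still exceeds the RPC limit, also raise the limit to 256 MB:
    pig -Dpig.compress.input.splits=true \
        -Dipc.maximum.data.length=268435456 myscript.pig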

> Write inputsplits in Tez to disk if the size is huge and option to compress 
> pig input splits
> --------------------------------------------------------------------------------------------
>
>                 Key: PIG-4443
>                 URL: https://issues.apache.org/jira/browse/PIG-4443
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.15.0
>
>         Attachments: PIG-4443-1.patch, PIG-4443-Fix-TEZ-2192-2.patch, 
> PIG-4443-Fix-TEZ-2192.patch
>
>
> Pig sets the input split information in user payload and when running against 
> a table with 10s of 1000s of partitions, DAG submission fails with
> java.io.IOException: Requested data length 305844060 is longer than maximum
> configured RPC length 67108864



