[ 
https://issues.apache.org/jira/browse/PIG-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597328#comment-14597328
 ] 

Ángel Álvarez commented on PIG-4443:
------------------------------------

I'm only getting this error with Pig on Tez (it works fine on MapReduce), and it 
happens while submitting the job ("Cannot submit DAG").

I'm using HDP 2.2.0.0-2041, so in order to test Pig 0.15.0, I built this version 
from its sources, packaged it, added its dependencies to a single zip file, 
and uploaded it to my HDFS. I'm also using LzoCodec to compress the intermediate 
temporary files (but the error also occurs without LzoCodec).

I execute these commands to run my script:

export JAVA_HOME=/usr/jdk64/jdk1.7.0_67
export HDP_VERSION=2.2.0.0-2041
export HADOOP_HOME=/usr/hdp/$HDP_VERSION/hadoop
export HIVE_HOME=/usr/hdp/$HDP_VERSION/hive
export HCAT_HOME=/usr/hdp/$HDP_VERSION/hive-hcatalog
export PIG_OPTS="-DUSE_TEZ_SESSION=true
-Dtez.lib.uris=/hdp/apps/2.2.0.0-2041/tez-0.15.0/tez.tar.gz
-Dpig.tmpfilecompression=true
-Dpig.tmpfilecompression.codec=lzo
-Dtez.runtime.intermediate-output.should-compress=true
-Dtez.runtime.intermediate-output.is-compressed=true
-Dtez.runtime.intermediate-output.compress.codec=com.hadoop.compression.lzo.LzoCodec
-Dtez.runtime.intermediate-input.is-compressed=true
-Dtez.runtime.intermediate-input.compress.codec=com.hadoop.compression.lzo.LzoCodec
-Dtez.runtime.compress.codec=com.hadoop.compression.lzo.LzoCodec"

./pig-0.15.0-src/bin/pig -useHCatalog -x tez -f myscript.pig 


> Write inputsplits in Tez to disk if the size is huge and option to compress 
> pig input splits
> --------------------------------------------------------------------------------------------
>
>                 Key: PIG-4443
>                 URL: https://issues.apache.org/jira/browse/PIG-4443
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.15.0
>
>         Attachments: PIG-4443-1.patch, PIG-4443-Fix-TEZ-2192-2.patch, 
> PIG-4443-Fix-TEZ-2192.patch
>
>
> Pig sets the input split information in the user payload, and when running 
> against a table with tens of thousands of partitions, DAG submission fails with:
> java.io.IOException: Requested data length 305844060 is longer than maximum
> configured RPC length 67108864
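
To put the two numbers in that exception side by side, here is a minimal shell
sketch that only does the arithmetic. The variable names are made up for
illustration. The 67108864-byte figure is exactly 64 MiB, which matches the
default of Hadoop's ipc.maximum.data.length setting; that this setting is the
one being hit here is an assumption, not something stated in the issue.

```shell
# Values copied from the IOException quoted above.
requested_bytes=305844060   # size of the payload sent at DAG submission
max_rpc_bytes=67108864      # configured maximum RPC length

mib=$((1024 * 1024))
requested_mib=$((requested_bytes / mib))   # whole MiB, rounded down
max_rpc_mib=$((max_rpc_bytes / mib))

# Ceiling division: smallest whole multiple of the limit that would fit the payload.
min_factor=$(( (requested_bytes + max_rpc_bytes - 1) / max_rpc_bytes ))

echo "requested: ${requested_mib} MiB, limit: ${max_rpc_mib} MiB"
echo "limit would need to grow at least ${min_factor}x to fit this payload"
```

The payload is roughly 291 MiB against a 64 MiB limit, so the gap is large
enough that shrinking or offloading the split information (as the issue title
proposes) looks more practical than raising the limit.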



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
