[ https://issues.apache.org/jira/browse/PIG-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600532#comment-14600532 ]

Ángel Álvarez commented on PIG-4443:
------------------------------------

*MASTER_NODE LOG* (from where I'm executing Pig)
{noformat}
2015-06-25 03:32:48,978 [PigTezLauncher-0] ERROR org.apache.pig.backend.hadoop.executionengine.tez.TezJob - Cannot submit DAG - Application id: application_1434957727568_0731
org.apache.tez.dag.api.TezException: com.google.protobuf.ServiceException: java.io.EOFException: End of File Exception between local host is: "MASTER_NODE"; destination host is: "AM_NODE":40698; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException
        at org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:476)
        at org.apache.tez.client.TezClient.submitDAG(TezClient.java:391)
        at org.apache.pig.backend.hadoop.executionengine.tez.TezJob.run(TezJob.java:161)
        at org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher$1.run(TezLauncher.java:187)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: com.google.protobuf.ServiceException: java.io.EOFException: End of File Exception between local host is: "MASTER_NODE"; destination host is: "AM_NODE":40698; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:246)
        at com.sun.proxy.$Proxy32.submitDAG(Unknown Source)
        at org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:469)
        ... 8 more
Caused by: java.io.EOFException: End of File Exception between local host is: "MASTER_NODE"; destination host is: "AM_NODE":40698; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
        at org.apache.hadoop.ipc.Client.call(Client.java:1472)
        at org.apache.hadoop.ipc.Client.call(Client.java:1399)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        ... 10 more
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1071)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:966)
{noformat}

*AM_NODE LOG* (the requested data length may differ from the run above because I deleted and re-imported two of the Hive tables)
{noformat}
2015-06-25 03:22:22,437 WARN [Socket Reader #1 for port 42481] ipc.Server: Requested data length 158480507 is longer than maximum configured RPC length 67108864. RPC came from MASTER_NODE
2015-06-25 03:22:22,438 INFO [Socket Reader #1 for port 42481] ipc.Server: Socket Reader #1 for port 42481: readAndProcess from client MASTER_NODE threw exception [java.io.IOException: Requested data length 158480507 is longer than maximum configured RPC length 67108864. RPC came from MASTER_NODE]
java.io.IOException: Requested data length 158480507 is longer than maximum configured RPC length 67108864. RPC came from MASTER_NODE
        at org.apache.hadoop.ipc.Server$Connection.checkDataLength(Server.java:1459)
        at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1521)
        at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:762)
        at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:636)
        at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:607)
{noformat}
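For reference, the 67108864-byte (64 MB) cap in this trace is Hadoop's default maximum RPC payload, controlled by the standard `ipc.maximum.data.length` property. As an illustrative workaround sketch only (it masks the oversized DAG payload rather than fixing it), the cap could be raised in core-site.xml on the affected nodes:

```xml
<!-- core-site.xml: illustrative only; raising the cap does not fix the
     underlying problem of the input-split payload growing past the limit -->
<property>
  <name>ipc.maximum.data.length</name>
  <value>134217728</value> <!-- 128 MB; the default is 67108864 (64 MB) -->
</property>
```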
*SPLITS AND SIZES* (all the serialized sizes are less than the spillThreshold=33554432)
{noformat}
2015-06-25 03:21:34,277 [main] INFO  org.apache.tez.mapreduce.hadoop.MRInputHelpers - NumSplits: 9, SerializedSize: 1163342
2015-06-25 03:21:34,570 [main] INFO  org.apache.tez.mapreduce.hadoop.MRInputHelpers - NumSplits: 28, SerializedSize: 2207068
2015-06-25 03:21:34,858 [main] INFO  org.apache.tez.mapreduce.hadoop.MRInputHelpers - NumSplits: 28, SerializedSize: 1479632
2015-06-25 03:21:35,517 [main] INFO  org.apache.tez.mapreduce.hadoop.MRInputHelpers - NumSplits: 85, SerializedSize: 17841999
2015-06-25 03:21:35,773 [main] INFO  org.apache.tez.mapreduce.hadoop.MRInputHelpers - NumSplits: 1, SerializedSize: 191480
2015-06-25 03:21:36,087 [main] INFO  org.apache.tez.mapreduce.hadoop.MRInputHelpers - NumSplits: 3, SerializedSize: 581998
2015-06-25 03:21:36,475 [main] INFO  org.apache.tez.mapreduce.hadoop.MRInputHelpers - NumSplits: 11, SerializedSize: 2580419
2015-06-25 03:21:36,897 [main] INFO  org.apache.tez.mapreduce.hadoop.MRInputHelpers - NumSplits: 1, SerializedSize: 166474
2015-06-25 03:21:37,337 [main] INFO  org.apache.tez.mapreduce.hadoop.MRInputHelpers - NumSplits: 4, SerializedSize: 936317
2015-06-25 03:21:37,780 [main] INFO  org.apache.tez.mapreduce.hadoop.MRInputHelpers - NumSplits: 1, SerializedSize: 130474
{noformat}
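As a quick sanity check on the figures above (a minimal sketch; the byte counts are copied verbatim from the MRInputHelpers log lines), every individual serialized payload is indeed below the 33554432-byte spill threshold, so none of them would be spilled to disk by that check alone:

```python
# SerializedSize values (bytes) from the MRInputHelpers log lines above
sizes = [1163342, 2207068, 1479632, 17841999, 191480,
         581998, 2580419, 166474, 936317, 130474]

SPILL_THRESHOLD = 33554432  # spillThreshold noted above (32 MB)
RPC_LIMIT = 67108864        # maximum RPC length from the AM_NODE log (64 MB)

# Each payload stays under the spill threshold, so no split is written to disk.
print(all(s < SPILL_THRESHOLD for s in sizes))  # True

# Even the sum of these payloads is under the RPC limit on its own.
print(sum(sizes))  # 27279203
```

Note the sum of these particular payloads (about 27 MB) is still below the 64 MB RPC limit, consistent with the remark above that the 158480507-byte request in the AM log came from a later run after the tables were re-imported.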

I'm getting the same error in another Pig script with 18 HCatLoaders, but both scripts work fine in Pig 0.14/Tez 0.5.2.


> Write inputsplits in Tez to disk if the size is huge and option to compress 
> pig input splits
> --------------------------------------------------------------------------------------------
>
>                 Key: PIG-4443
>                 URL: https://issues.apache.org/jira/browse/PIG-4443
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.15.0
>
>         Attachments: PIG-4443-1.patch, PIG-4443-Fix-TEZ-2192-2.patch, 
> PIG-4443-Fix-TEZ-2192.patch
>
>
> Pig sets the input split information in user payload and when running against 
> a table with 10s of 1000s of partitions, DAG submission fails with
> java.io.IOException: Requested data length 305844060 is longer than maximum
> configured RPC length 67108864



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)