Any suggestions?

On Thu, Mar 13, 2014 at 1:07 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <[email protected]> wrote:

> Hello,
> I have a hadoop cluster upgraded to Hadoop 2.x and everything with it
> works fine. (Runs M/R jobs, able to perform actions on HDFS).
>
> When i run a pig script using pig grunt shell or pig -x mapreduce -f
> 'test.pig'.  In either case it connects to Hadoop cluster, starts M/R job,
> the M/R job completes fine. However the shell hangs and never returns.
>
>
> 2014-03-13 00:17:04,341 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: https://apollo-jt.vip.org.com:50030/proxy/application_1394582929977_7433/
> 2014-03-13 00:17:04,342 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1394582929977_7433
> 2014-03-13 00:17:04,342 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A,B,C
> 2014-03-13 00:17:04,342 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[4,4],C[6,4],B[5,4] C: C[6,4],B[5,4] R: C[6,4]
> 2014-03-13 00:17:04,365 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2014-03-13 00:17:36,232 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: <AM_Host_Name>/<AM_Host_IP>:47718. Already tried 0 time(s); maxRetries=45
> 2014-03-13 00:17:36,232 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: <AM_Host_Name>/<AM_Host_IP>:47718. Already tried 1 time(s); maxRetries=45
> 2014-03-13 00:17:36,232 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: <AM_Host_Name>/<AM_Host_IP>:47718. Already tried 2 time(s); maxRetries=45
> ...
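>
> For context on the retry loop above, here is a minimal core-site.xml sketch
> of the stock Hadoop 2.x IPC client retry settings, with their default
> values. Treating the maxRetries=45 shown in the log as coming from
> ipc.client.connect.max.retries.on.timeouts is an assumption on my part.
>
>   <!-- Hadoop 2.x IPC client retry settings (defaults shown).
>        Assumption: the maxRetries=45 above maps to the timeouts setting. -->
>   <property>
>     <name>ipc.client.connect.max.retries</name>
>     <value>10</value>
>   </property>
>   <property>
>     <name>ipc.client.connect.max.retries.on.timeouts</name>
>     <value>45</value>
>   </property>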
>
>
>
> On analyzing this, I found that <AM_Host_Name> matches the host running the
> Application Master of the M/R job.
> Questions:
> 1) Does the client machine attempt to connect to the Application Master in
> order to get the status of the M/R job?
> 2) If #1 is true, and since this is a secure Hadoop 2.x cluster, does that
> mean the firewall must be open between the client and the Application Master
> (which can be any node in the cluster) on that port?
> 3) I assumed #1 and #2 are true and therefore had the firewall opened between
> the client and all nodes in the Hadoop cluster (since any of them can host
> the Application Master) for port 47718. However, to my surprise, I found that
> this port (47718) changed.
> Is there a setting or a group of port numbers used for this client-to-AM
> status communication? If yes, where can I find this list? (See the
> mapred-site.xml sketch after question 4 below.)
>
> 4) How do I get the grunt shell back and see the status/progress of the job
> from the client machine?
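>
> For question 3, a minimal mapred-site.xml sketch, assuming the relevant
> service is the MapReduce Application Master's client (job status) port: in
> stock Hadoop 2.x, yarn.app.mapreduce.am.job.client.port-range restricts the
> ports the AM binds for client connections, so a firewall rule can target a
> fixed range instead of a random ephemeral port. The 50100-50200 range below
> is only an illustrative value.
>
>   <!-- Pin the MR AM's client service to a fixed port range so the
>        client-to-AM firewall rule can cover it. Range is illustrative. -->
>   <property>
>     <name>yarn.app.mapreduce.am.job.client.port-range</name>
>     <value>50100-50200</value>
>   </property>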
>
>
>
> --
> Deepak
>
>


-- 
Deepak
