Sure, Zhang. Thanks for the help.

-Rahul
On Aug 26, 2010, at 8:17 PM, Jeff Zhang wrote:

> It's weird. I suspect there may be another configuration file on your
> classpath which overrides your real conf files.
> Could you download a new Pig release and follow the instructions on
> http://hadoop.apache.org/pig/docs/r0.7.0/setup.html in a new
> environment?
>
> On Thu, Aug 26, 2010 at 7:49 PM, rahul <rmalv...@apple.com> wrote:
>> Hi,
>>
>> I tried the Grunt shell as well, but that also does not connect to Hadoop.
>> It throws a warning and runs the job in standalone mode. So I tried it
>> using pig.jar.
>>
>> Do you have any further suggestions on that?
>>
>> Rahul
>>
>> On Aug 26, 2010, at 7:23 PM, Jeff Zhang wrote:
>>
>>> Connecting to 9001 is right; that is the JobTracker's IPC port, while
>>> 50030 is its HTTP server port.
>>> And have you ever tried to run the Grunt shell?
>>>
>>> On Thu, Aug 26, 2010 at 7:12 PM, rahul <rmalv...@apple.com> wrote:
>>>> Hi Jeff,
>>>>
>>>> I can connect to the JobTracker web UI using the following URL:
>>>> http://localhost:50030/jobtracker.jsp
>>>>
>>>> I can also see the jobs which I ran directly on Hadoop using the
>>>> streaming API.
>>>>
>>>> I also see that it tries to connect to localhost/127.0.0.1:9001, which I
>>>> have specified in the Hadoop conf file. I have also tried changing this
>>>> location to localhost:50030, but the error remains the same.
>>>>
>>>> Can you suggest something further?
>>>>
>>>> Thanks,
>>>> Rahul
>>>>
>>>> On Aug 26, 2010, at 7:07 PM, Jeff Zhang wrote:
>>>>
>>>>> Can you look at the JobTracker log or access the JobTracker web UI?
>>>>> It seems you cannot connect to the JobTracker, according to your log:
>>>>>
>>>>> "Caused by: java.io.IOException: Call to localhost/127.0.0.1:9001
>>>>> failed on local exception: java.io.EOFException"
>>>>>
>>>>> On Fri, Aug 27, 2010 at 10:00 AM, rahul <rmalv...@apple.com> wrote:
>>>>>> Yes, they are running.
>>>>>>
>>>>>> On Aug 26, 2010, at 6:59 PM, Jeff Zhang wrote:
>>>>>>
>>>>>>> Execute the command jps in a shell to see whether the NameNode and
>>>>>>> JobTracker are running correctly.
>>>>>>>
>>>>>>> On Fri, Aug 27, 2010 at 9:49 AM, rahul <rmalv...@apple.com> wrote:
>>>>>>>> Hi Jeff,
>>>>>>>>
>>>>>>>> I transferred the Hadoop conf files to the pig/conf location, but I
>>>>>>>> still get the same error.
>>>>>>>>
>>>>>>>> Is the issue with the configuration files or with the HDFS file system?
>>>>>>>>
>>>>>>>> Can I test the connection to HDFS (localhost/127.0.0.1:9001) in some way?
>>>>>>>>
>>>>>>>> Steps I did:
>>>>>>>>
>>>>>>>> 1. I initially formatted my local file system using the ./hadoop
>>>>>>>> namenode -format command. I believe this mounts the local file system
>>>>>>>> onto HDFS.
>>>>>>>> 2. Then I configured the Hadoop conf files and started the ./start-all
>>>>>>>> script.
>>>>>>>> 3. Started Pig with a custom Pig script which should read HDFS, as I
>>>>>>>> passed HADOOP_CONF_DIR as a parameter. The command was:
>>>>>>>> java -cp $PIGDIR/pig.jar:$HADOOP_CONF_DIR org.apache.pig.Main script1-hadoop.pig
>>>>>>>>
>>>>>>>> Please let me know if these steps miss something.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Rahul
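For the "can I test the connection in some way?" question above, a rough sanity check from the same shell, independent of Pig, could look like the sketch below. The install path is an assumption taken from the hadoop.tmp.dir value quoted further down in the thread; note that 9000 is the HDFS port (fs.default.name) while 9001 is the JobTracker port (mapred.job.tracker), i.e. MapReduce rather than HDFS.

    cd /Users/rahulmalviya/Documents/Hadoop/hadoop-0.21.0   # assumed install directory
    jps                        # expect NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker
    bin/hadoop fs -ls /        # exercises fs.default.name (hdfs://localhost:9000)
    bin/hadoop job -list       # exercises mapred.job.tracker (localhost:9001)

If bin/hadoop job -list fails with the same EOFException, that points at the client configuration or the JobTracker itself rather than at Pig.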
>>>>>>>>
>>>>>>>> On Aug 26, 2010, at 6:33 PM, Jeff Zhang wrote:
>>>>>>>>
>>>>>>>>> Try putting the Hadoop XML configuration files into the pig/conf folder.
>>>>>>>>>
>>>>>>>>> On Thu, Aug 26, 2010 at 6:22 PM, rahul <rmalv...@apple.com> wrote:
>>>>>>>>>> Hi Jeff,
>>>>>>>>>>
>>>>>>>>>> I have put the Hadoop conf on the classpath by setting the
>>>>>>>>>> $HADOOP_CONF_DIR variable.
>>>>>>>>>>
>>>>>>>>>> But I have both Pig and Hadoop running on the same machine, so
>>>>>>>>>> localhost should not make a difference.
>>>>>>>>>>
>>>>>>>>>> I have used all the default config settings for core-site.xml,
>>>>>>>>>> hdfs-site.xml, and mapred-site.xml, as per the Hadoop tutorial.
>>>>>>>>>>
>>>>>>>>>> Please let me know if my understanding is correct.
>>>>>>>>>>
>>>>>>>>>> I am attaching the conf files as well:
>>>>>>>>>>
>>>>>>>>>> hdfs-site.xml:
>>>>>>>>>>
>>>>>>>>>> <?xml version="1.0"?>
>>>>>>>>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>>>>>>>>
>>>>>>>>>> <!-- Put site-specific property overrides in this file. -->
>>>>>>>>>>
>>>>>>>>>> <configuration>
>>>>>>>>>>   <property>
>>>>>>>>>>     <name>fs.default.name</name>
>>>>>>>>>>     <value>hdfs://localhost:9000</value>
>>>>>>>>>>     <description>The name of the default file system. A URI whose
>>>>>>>>>>     scheme and authority determine the FileSystem implementation. The
>>>>>>>>>>     uri's scheme determines the config property (fs.SCHEME.impl) naming
>>>>>>>>>>     the FileSystem implementation class. The uri's authority is used to
>>>>>>>>>>     determine the host, port, etc. for a filesystem.</description>
>>>>>>>>>>   </property>
>>>>>>>>>>
>>>>>>>>>>   <property>
>>>>>>>>>>     <name>dfs.replication</name>
>>>>>>>>>>     <value>1</value>
>>>>>>>>>>     <description>Default block replication.
>>>>>>>>>>     The actual number of replications can be specified when the file is created.
>>>>>>>>>>     The default is used if replication is not specified at create time.
>>>>>>>>>>     </description>
>>>>>>>>>>   </property>
>>>>>>>>>> </configuration>
>>>>>>>>>>
>>>>>>>>>> core-site.xml:
>>>>>>>>>>
>>>>>>>>>> <?xml version="1.0"?>
>>>>>>>>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>>>>>>>>
>>>>>>>>>> <!-- Put site-specific property overrides in this file. -->
>>>>>>>>>>
>>>>>>>>>> <configuration>
>>>>>>>>>>   <property>
>>>>>>>>>>     <name>hadoop.tmp.dir</name>
>>>>>>>>>>     <value>/Users/rahulmalviya/Documents/Hadoop/hadoop-0.21.0/hadoop-${user.name}</value>
>>>>>>>>>>     <description>A base for other temporary directories.</description>
>>>>>>>>>>   </property>
>>>>>>>>>> </configuration>
>>>>>>>>>>
>>>>>>>>>> mapred-site.xml:
>>>>>>>>>>
>>>>>>>>>> <?xml version="1.0"?>
>>>>>>>>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>>>>>>>>
>>>>>>>>>> <!-- Put site-specific property overrides in this file. -->
>>>>>>>>>>
>>>>>>>>>> <configuration>
>>>>>>>>>>   <property>
>>>>>>>>>>     <name>mapred.job.tracker</name>
>>>>>>>>>>     <value>localhost:9001</value>
>>>>>>>>>>     <description>The host and port that the MapReduce job tracker runs
>>>>>>>>>>     at. If "local", then jobs are run in-process as a single map
>>>>>>>>>>     and reduce task.
>>>>>>>>>>     </description>
>>>>>>>>>>   </property>
>>>>>>>>>>
>>>>>>>>>>   <property>
>>>>>>>>>>     <name>mapred.tasktracker.tasks.maximum</name>
>>>>>>>>>>     <value>8</value>
>>>>>>>>>>     <description>The maximum number of tasks that will be run
>>>>>>>>>>     simultaneously by a task tracker.
>>>>>>>>>>     </description>
>>>>>>>>>>   </property>
>>>>>>>>>> </configuration>
>>>>>>>>>>
>>>>>>>>>> Please let me know if there is an issue in my configurations. Any
>>>>>>>>>> input is valuable to me.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Rahul
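One way to check whether these are the files the java -cp invocation actually sees, which is the suspicion raised in the most recent reply at the top of the thread, might be a sketch like this (HADOOP_CONF_DIR and PIGDIR are the same variables used in that command; the grep pattern is only illustrative):

    echo $HADOOP_CONF_DIR             # should point at the directory holding the files above
    ls $HADOOP_CONF_DIR/*-site.xml    # core-site.xml, hdfs-site.xml, mapred-site.xml
    # anything bundled inside pig.jar would win, because pig.jar is listed
    # before $HADOOP_CONF_DIR on the -cp line
    unzip -l $PIGDIR/pig.jar | grep -iE 'site\.xml|hadoop-.*\.xml'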
>>>>>>>>>>
>>>>>>>>>> On Aug 26, 2010, at 6:10 PM, Jeff Zhang wrote:
>>>>>>>>>>
>>>>>>>>>>> Did you put the Hadoop conf on the classpath? It seems you are still
>>>>>>>>>>> using the local file system but connecting to Hadoop's JobTracker.
>>>>>>>>>>> Make sure you set the correct configuration in core-site.xml,
>>>>>>>>>>> hdfs-site.xml, and mapred-site.xml, and put them on the classpath.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Aug 26, 2010 at 5:32 PM, rahul <rmalv...@apple.com> wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I am trying to integrate Pig with Hadoop for processing of jobs.
>>>>>>>>>>>>
>>>>>>>>>>>> I am able to run Pig in local mode and Hadoop with the streaming API
>>>>>>>>>>>> perfectly.
>>>>>>>>>>>>
>>>>>>>>>>>> But when I try to run Pig with Hadoop I get the following error:
>>>>>>>>>>>>
>>>>>>>>>>>> Pig Stack Trace
>>>>>>>>>>>> ---------------
>>>>>>>>>>>> ERROR 2116: Unexpected error. Could not validate the output specification for:
>>>>>>>>>>>> file:///Users/rahulmalviya/Documents/Pig/dev/main_merged_hdp_out
>>>>>>>>>>>>
>>>>>>>>>>>> org.apache.pig.impl.plan.PlanValidationException: ERROR 0: An
>>>>>>>>>>>> unexpected exception caused the validation to stop
>>>>>>>>>>>>   at org.apache.pig.impl.plan.PlanValidator.validate(PlanValidator.java:56)
>>>>>>>>>>>>   at org.apache.pig.impl.logicalLayer.validators.InputOutputFileValidator.validate(InputOutputFileValidator.java:49)
>>>>>>>>>>>>   at org.apache.pig.impl.logicalLayer.validators.InputOutputFileValidator.validate(InputOutputFileValidator.java:37)
>>>>>>>>>>>>   at org.apache.pig.impl.logicalLayer.validators.LogicalPlanValidationExecutor.validate(LogicalPlanValidationExecutor.java:89)
>>>>>>>>>>>>   at org.apache.pig.PigServer.validate(PigServer.java:930)
>>>>>>>>>>>>   at org.apache.pig.PigServer.compileLp(PigServer.java:910)
>>>>>>>>>>>>   at org.apache.pig.PigServer.compileLp(PigServer.java:871)
>>>>>>>>>>>>   at org.apache.pig.PigServer.compileLp(PigServer.java:852)
>>>>>>>>>>>>   at org.apache.pig.PigServer.execute(PigServer.java:816)
>>>>>>>>>>>>   at org.apache.pig.PigServer.access$100(PigServer.java:105)
>>>>>>>>>>>>   at org.apache.pig.PigServer$Graph.execute(PigServer.java:1080)
>>>>>>>>>>>>   at org.apache.pig.PigServer.executeBatch(PigServer.java:288)
>>>>>>>>>>>>   at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:109)
>>>>>>>>>>>>   at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
>>>>>>>>>>>>   at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
>>>>>>>>>>>>   at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
>>>>>>>>>>>>   at org.apache.pig.Main.main(Main.java:391)
>>>>>>>>>>>> Caused by: org.apache.pig.impl.plan.PlanValidationException: ERROR
>>>>>>>>>>>> 2116: Unexpected error.
>>>>>>>>>>>> Could not validate the output specification for:
>>>>>>>>>>>> file:///Users/rahulmalviya/Documents/Pig/dev/main_merged_hdp_out
>>>>>>>>>>>>   at org.apache.pig.impl.logicalLayer.validators.InputOutputFileVisitor.visit(InputOutputFileVisitor.java:93)
>>>>>>>>>>>>   at org.apache.pig.impl.logicalLayer.LOStore.visit(LOStore.java:140)
>>>>>>>>>>>>   at org.apache.pig.impl.logicalLayer.LOStore.visit(LOStore.java:37)
>>>>>>>>>>>>   at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:67)
>>>>>>>>>>>>   at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:69)
>>>>>>>>>>>>   at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:69)
>>>>>>>>>>>>   at org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:50)
>>>>>>>>>>>>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
>>>>>>>>>>>>   at org.apache.pig.impl.plan.PlanValidator.validate(PlanValidator.java:50)
>>>>>>>>>>>>   ... 16 more
>>>>>>>>>>>> Caused by: java.io.IOException: Call to localhost/127.0.0.1:9001
>>>>>>>>>>>> failed on local exception: java.io.EOFException
>>>>>>>>>>>>   at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
>>>>>>>>>>>>   at org.apache.hadoop.ipc.Client.call(Client.java:743)
>>>>>>>>>>>>   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>>>>>>>>>>>>   at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
>>>>>>>>>>>>   at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
>>>>>>>>>>>>   at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:429)
>>>>>>>>>>>>   at org.apache.hadoop.mapred.JobClient.init(JobClient.java:423)
>>>>>>>>>>>>   at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:410)
>>>>>>>>>>>>   at org.apache.hadoop.mapreduce.Job.<init>(Job.java:50)
>>>>>>>>>>>>   at org.apache.pig.impl.logicalLayer.validators.InputOutputFileVisitor.visit(InputOutputFileVisitor.java:89)
>>>>>>>>>>>>   ... 24 more
>>>>>>>>>>>> Caused by: java.io.EOFException
>>>>>>>>>>>>   at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>>>>>>>>>>>   at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
>>>>>>>>>>>>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
>>>>>>>>>>>> ================================================================================
>>>>>>>>>>>>
>>>>>>>>>>>> Did anyone get the same error? I think it is related to the
>>>>>>>>>>>> connection between Pig and Hadoop.
>>>>>>>>>>>>
>>>>>>>>>>>> Can someone tell me how to connect Pig and Hadoop?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Best Regards
>>>>>>>>>>>
>>>>>>>>>>> Jeff Zhang
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best Regards
>>>>>>>>>
>>>>>>>>> Jeff Zhang
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards
>>>>>>>
>>>>>>> Jeff Zhang
>>>>>
>>>>> --
>>>>> Best Regards
>>>>>
>>>>> Jeff Zhang
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>
> --
> Best Regards
>
> Jeff Zhang
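The last suggestion in the thread (a clean Pig download run against the same cluster, per the linked setup.html page) might look roughly like the sketch below. The paths and the PIG_CLASSPATH variable are assumptions; if this Pig release's bin/pig script does not pick up PIG_CLASSPATH, copying the *-site.xml files into pig/conf, as suggested earlier in the thread, serves the same purpose.

    # unpack a fresh Pig release, then point it at the existing cluster configuration
    export HADOOP_CONF_DIR=/Users/rahulmalviya/Documents/Hadoop/hadoop-0.21.0/conf   # assumed conf location
    export PIG_CLASSPATH=$HADOOP_CONF_DIR

    # start the Grunt shell in mapreduce mode; on startup it should report a
    # connection to hdfs://localhost:9000 rather than falling back to file:///
    bin/pig -x mapreduce
    grunt> ls
    grunt> quit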