Re: using "-libjars" in Hadoop 2.2.1

Kim Chew Wed, 16 Apr 2014 13:39:12 -0700

Thanks Rahman. This problem can be boiled down to how to submit a job
compiled with Hadoop-1.1.1 remotely to a Hadoop 2 cluster that has not
turned on YARN. I will open another thread for it.


Kim


On Wed, Apr 16, 2014 at 1:30 PM, Abdelrahman Shettia <
[email protected]> wrote:

> Hi Kim,
>
> You can try to grep on the RM java process by running the following
> command:
>
> ps aux | grep
>
>
>
>
> On Wed, Apr 16, 2014 at 10:31 AM, Kim Chew <[email protected]> wrote:
>
>> Thanks Rahman. I had mixed things up a little in my mapred-site.xml,
>> so it tried to run the job locally. Now I am running into the same problem
>> as Rahul: I am unable to connect to the ResourceManager.
>>
>> My target cluster runs MR1 instead of YARN, hence
>> "mapreduce.framework.name" is set to "classic".
>>
>> Here are my settings in my mapred-site.xml on the client side.
>>
>> <property>
>>     <!-- Pointed to the remote JobTracker -->
>>         <name>mapreduce.job.tracker.address</name>
>>         <value>172.31.3.150:8021</value>
>>     </property>
>>     <property>
>>         <name>mapreduce.framework.name</name>
>>         <value>yarn</value>
>>     </property>
>>
>> and my yarn-site.xml
>>
>>        <property>
>>             <description>The hostname of the RM.</description>
>>             <name>yarn.resourcemanager.hostname</name>
>>             <value>172.31.3.150</value>
>>         </property>
>>
>>         <property>
>>             <description>The address of the applications manager
>> interface in the RM.</description>
>>             <name>yarn.resourcemanager.address</name>
>>             <value>${yarn.resourcemanager.hostname}:8032</value>
>>         </property>
>>
>> 14/04/16 10:23:02 INFO client.RMProxy: Connecting to ResourceManager at /
>> 172.31.3.150:8032
>> 14/04/16 10:23:10 INFO ipc.Client: Retrying connect to server:
>> hadoop-host1.eng.narus.com/172.31.3.150:8032. Already tried 0 time(s);
>> retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
>> sleepTime=1 SECONDS)
>>
>> Therefore, the question is how do I figure out where the ResourceManager
>> is running?
>>
>> TIA
>>
>> Kim
>>
>>
>>
>>  On Wed, Apr 16, 2014 at 8:43 AM, Abdelrahman Shettia <
>> [email protected]> wrote:
>>
>>>  Hi Kim,
>>>
>>> It looks like it is pointing to hdfs location. Can you create the hdfs
>>> dir and put the jar there? Hope this helps
>>> Thanks,
>>> Rahman
>>>
>>> On Apr 16, 2014, at 8:39 AM, Rahul Singh <[email protected]>
>>> wrote:
>>>
>>> any help...all are welcome?
>>>
>>>
>>> On Wed, Apr 16, 2014 at 1:13 PM, Rahul Singh <[email protected]
>>> > wrote:
>>>
>>>> Hi,
>>>>  I am running the following command, but the jar is still not
>>>> available to the mappers and reducers.
>>>>
>>>> hadoop jar /home/hduser/workspace/Minerva.jar my.search.Minerva
>>>> /user/hduser/input_minerva_actual /user/hduser/output_merva_actual3
>>>> -libjars /home/hduser/Documents/Lib/json-simple-1.1.1.jar
>>>> -Dmapreduce.user.classpath.first=true
>>>>
>>>>
>>>> Error Log
>>>>
>>>> 14/04/16 13:08:37 INFO client.RMProxy: Connecting to ResourceManager at
>>>> /0.0.0.0:8032
>>>> 14/04/16 13:08:37 INFO client.RMProxy: Connecting to ResourceManager at
>>>> /0.0.0.0:8032
>>>> 14/04/16 13:08:37 WARN mapreduce.JobSubmitter: Hadoop command-line
>>>> option parsing not performed. Implement the Tool interface and execute your
>>>> application with ToolRunner to remedy this.
>>>> 14/04/16 13:08:37 INFO mapred.FileInputFormat: Total input paths to
>>>> process : 1
>>>> 14/04/16 13:08:37 INFO mapreduce.JobSubmitter: number of splits:10
>>>> 14/04/16 13:08:37 INFO mapreduce.JobSubmitter: Submitting tokens for
>>>> job: job_1397534064728_0028
>>>> 14/04/16 13:08:38 INFO impl.YarnClientImpl: Submitted application
>>>> application_1397534064728_0028
>>>> 14/04/16 13:08:38 INFO mapreduce.Job: The url to track the job:
>>>> http://L-Rahul-Tech:8088/proxy/application_1397534064728_0028/
>>>> 14/04/16 13:08:38 INFO mapreduce.Job: Running job:
>>>> job_1397534064728_0028
>>>> 14/04/16 13:08:47 INFO mapreduce.Job: Job job_1397534064728_0028
>>>> running in uber mode : false
>>>> 14/04/16 13:08:47 INFO mapreduce.Job:  map 0% reduce 0%
>>>> 14/04/16 13:08:58 INFO mapreduce.Job: Task Id :
>>>> attempt_1397534064728_0028_m_000005_0, Status : FAILED
>>>> Error: java.lang.RuntimeException: Error in configuring object
>>>>     at
>>>> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
>>>>     at
>>>> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>>>>     at
>>>> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>>>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
>>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>>>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>     at javax.security.auth.Subject.doAs(Subject.java:416)
>>>>     at
>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>>>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
>>>> Caused by: java.lang.reflect.InvocationTargetException
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>     at
>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>     at
>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>     at java.lang.reflect.Method.invoke(Method.java:622)
>>>>     at
>>>> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
>>>>     ... 9 more
>>>> Caused by: java.lang.NoClassDefFoundError:
>>>> org/json/simple/parser/ParseException
>>>>     at java.lang.Class.forName0(Native Method)
>>>>     at java.lang.Class.forName(Class.java:270)
>>>>     at
>>>> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1821)
>>>>     at
>>>> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1786)
>>>>     at
>>>> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1880)
>>>>     at
>>>> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1906)
>>>>     at
>>>> org.apache.hadoop.mapred.JobConf.getMapperClass(JobConf.java:1107)
>>>>     at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
>>>>     ... 14 more
>>>> Caused by: java.lang.ClassNotFoundException:
>>>> org.json.simple.parser.ParseException
>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
>>>>     ... 22 more
>>>>
>>>> When I analyzed the logs, I found this warning:
>>>> "14/04/16 13:08:37 WARN mapreduce.JobSubmitter: Hadoop command-line
>>>> option parsing not performed. Implement the Tool interface and execute your
>>>> application with ToolRunner to remedy this."
>>>>
>>>> But I have implemented the Tool interface as shown below:
>>>>
>>>> package my.search;
>>>>
>>>> import org.apache.hadoop.conf.Configured;
>>>> import org.apache.hadoop.fs.Path;
>>>> import org.apache.hadoop.io.Text;
>>>> import org.apache.hadoop.mapred.FileInputFormat;
>>>> import org.apache.hadoop.mapred.FileOutputFormat;
>>>> import org.apache.hadoop.mapred.JobClient;
>>>> import org.apache.hadoop.mapred.JobConf;
>>>> import org.apache.hadoop.mapred.TextInputFormat;
>>>> import org.apache.hadoop.mapred.TextOutputFormat;
>>>> import org.apache.hadoop.util.Tool;
>>>> import org.apache.hadoop.util.ToolRunner;
>>>>
>>>> public class Minerva extends Configured implements Tool
>>>> {
>>>>     public int run(String[] args) throws Exception {
>>>>         JobConf conf = new JobConf(Minerva.class);
>>>>         conf.setJobName("minerva sample job");
>>>>
>>>>         conf.setMapOutputKeyClass(Text.class);
>>>>         conf.setMapOutputValueClass(TextArrayWritable.class);
>>>>
>>>>         conf.setOutputKeyClass(Text.class);
>>>>         conf.setOutputValueClass(Text.class);
>>>>
>>>>         conf.setMapperClass(Map.class);
>>>>         // conf.setCombinerClass(Reduce.class);
>>>>         conf.setReducerClass(Reduce.class);
>>>>
>>>>         conf.setInputFormat(TextInputFormat.class);
>>>>         conf.setOutputFormat(TextOutputFormat.class);
>>>>
>>>>         FileInputFormat.setInputPaths(conf, new Path(args[0]));
>>>>         FileOutputFormat.setOutputPath(conf, new Path(args[1]));
>>>>
>>>>         JobClient.runJob(conf);
>>>>
>>>>         return 0;
>>>>     }
>>>>
>>>>     public static void main(String[] args) throws Exception {
>>>>         int res = ToolRunner.run(new Minerva(), args);
>>>>         System.exit(res);
>>>>     }
>>>> }
>>>>
>>>>
>>>> Please let me know if you see any issues.
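Two likely causes are visible in the message above: the command places
-libjars and -D after the input and output paths, and run() builds its
JobConf with new JobConf(Minerva.class) rather than from getConf(). The
sketch below is a toy model with no Hadoop dependency; GenericOptionsModel,
freshJobConf and jobConfFrom are hypothetical names, and the parsing is a
simplification that assumes option parsing stops at the first non-option
argument. It only illustrates why options in that position, or a JobConf
created from scratch, would leave the jar setting out of the job.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these names are Hadoop classes.
class GenericOptionsModel {

    // Generic options parsed from the front of the argument list.
    final Map<String, String> conf = new HashMap<>();
    // Everything from the first non-option argument onwards.
    final List<String> remaining = new ArrayList<>();

    GenericOptionsModel(String[] args) {
        int i = 0;
        while (i < args.length) {
            String a = args[i];
            if (a.equals("-libjars") && i + 1 < args.length) {
                // "tmpjars" is used here as the illustrative key for the jar list.
                conf.put("tmpjars", args[i + 1]);
                i += 2;
            } else if (a.startsWith("-D") && a.indexOf('=') > 2) {
                int eq = a.indexOf('=');
                conf.put(a.substring(2, eq), a.substring(eq + 1));
                i += 1;
            } else {
                break; // the first non-option argument ends option parsing
            }
        }
        while (i < args.length) {
            remaining.add(args[i++]);
        }
    }

    // Models "new JobConf(Minerva.class)": starts empty, parsed options are lost.
    static Map<String, String> freshJobConf() {
        return new HashMap<>();
    }

    // Models "new JobConf(getConf(), Minerva.class)": copies the parsed options.
    static Map<String, String> jobConfFrom(Map<String, String> parsed) {
        return new HashMap<>(parsed);
    }

    public static void main(String[] args) {
        String jar = "/home/hduser/Documents/Lib/json-simple-1.1.1.jar";
        GenericOptionsModel last =
            new GenericOptionsModel(new String[] {"in", "out", "-libjars", jar});
        GenericOptionsModel first =
            new GenericOptionsModel(new String[] {"-libjars", jar, "in", "out"});

        System.out.println("options last:   " + last.conf + " " + last.remaining);
        System.out.println("options first:  " + first.conf + " " + first.remaining);
        System.out.println("fresh job conf: " + freshJobConf());
        System.out.println("from getConf(): " + jobConfFrom(first.conf));
    }
}
```

In the real job, the corresponding changes would be to move -libjars and -D
ahead of the input and output paths and to construct the job with
new JobConf(getConf(), Minerva.class). That suggestion has not been
verified against this cluster.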
>>>>
>>>>
>>>>
>>>> On Thu, Apr 10, 2014 at 9:29 AM, Shengjun Xin <[email protected]> wrote:
>>>>
>>>>> add '-Dmapreduce.user.classpath.first=true' to your command and try
>>>>> again
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Apr 9, 2014 at 6:27 AM, Kim Chew <[email protected]> wrote:
>>>>>
>>>>>> It seems to me that in Hadoop 2.2.1, the "-libjars" option does
>>>>>> not look for jars on the local file system but on HDFS. For
>>>>>> example,
>>>>>>
>>>>>> hadoop jar target/myJar.jar Foo -libjars
>>>>>> /home/kchew/test-libs/testJar.jar /user/kchew/inputs/raw.vector
>>>>>> /user/kchew/outputs hdfs://remoteNN:8020 remoteJT:8021
>>>>>>
>>>>>> 14/04/08 15:11:02 INFO jvm.JvmMetrics: Initializing JVM Metrics with
>>>>>> processName=JobTracker, sessionId=
>>>>>> 14/04/08 15:11:02 INFO mapreduce.JobSubmitter: Cleaning up the
>>>>>> staging area
>>>>>> file:/tmp/hadoop-kchew/mapred/staging/kchew202924688/.staging/job_local202924688_0001
>>>>>> 14/04/08 15:11:02 ERROR security.UserGroupInformation:
>>>>>> PriviledgedActionException as:kchew (auth:SIMPLE)
>>>>>> cause:java.io.FileNotFoundException: File does not exist:
>>>>>> hdfs://remoteNN:8020/home/kchew/test-libs/testJar.jar
>>>>>> java.io.FileNotFoundException: File does not exist:
>>>>>> hdfs:/remoteNN:8020/home/kchew/test-libs/testJar.jar
>>>>>>     at
>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1110)
>>>>>>     at
>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
>>>>>>     at
>>>>>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>>>>>     at
>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
>>>>>>     at
>>>>>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
>>>>>>     at
>>>>>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
>>>>>>     at
>>>>>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
>>>>>>     at
>>>>>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
>>>>>>     at
>>>>>> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:264)
>>>>>>
>>>>>> So under Hadoop 2.2.1, do I have to explicitly set some
>>>>>> configuration so that the "-libjars" option copies the file
>>>>>> from the local fs to HDFS?
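The FileNotFoundException above reports the scheme-less local path under
hdfs://remoteNN:8020. One way to read that is that the path was qualified
against the default filesystem. The sketch below uses only java.net.URI to
show how such a qualification produces the URI seen in the log.
PathQualifier and qualify are hypothetical names, and this models the
behaviour only; it is not Hadoop's actual code path.

```java
import java.net.URI;

// Toy model only: PathQualifier and qualify are hypothetical, not Hadoop APIs.
class PathQualifier {

    // Returns the path unchanged if it already has a scheme; otherwise
    // resolves it against the default filesystem URI.
    static URI qualify(String path, URI defaultFs) {
        URI u = URI.create(path);
        if (u.getScheme() != null) {
            return u; // already fully qualified, e.g. file:///...
        }
        return defaultFs.resolve(u);
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://remoteNN:8020");
        // Scheme-less path: picks up the default filesystem's scheme and authority.
        System.out.println(qualify("/home/kchew/test-libs/testJar.jar", defaultFs));
        // Path with an explicit scheme: left as it is.
        System.out.println(qualify("file:///home/kchew/test-libs/testJar.jar", defaultFs));
    }
}
```

In this model, a path that carries an explicit file:// scheme is returned
unchanged. Whether that is enough to make "-libjars" pick up a local jar
on this cluster is an assumption that would need testing.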
>>>>>>
>>>>>> TIA
>>>>>>
>>>>>> Kim
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Regards
>>>>> Shengjun
>>>>>
>>>>
>>>>
>>>
>>>
>>> CONFIDENTIALITY NOTICE
>>> NOTICE: This message is intended for the use of the individual or entity
>>> to which it is addressed and may contain information that is confidential,
>>> privileged and exempt from disclosure under applicable law. If the reader
>>> of this message is not the intended recipient, you are hereby notified that
>>> any printing, copying, dissemination, distribution, disclosure or
>>> forwarding of this communication is strictly prohibited. If you have
>>> received this communication in error, please contact the sender immediately
>>> and delete it from your system. Thank You.
>>
>>
>>
>
