Please, could anyone respond to my query:

Why am I getting this warning?

"14/04/16 13:08:37 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this."

Because of this my -libjars jar is not being picked up and I am getting a NoClassDefFoundError.

Thanks and Regards,
Rahul Singh

On Thu, Apr 17, 2014 at 2:08 AM, Kim Chew <[email protected]> wrote:

Thanks Rahman. This problem boils down to how to submit a job compiled with Hadoop-1.1.1 remotely to a Hadoop 2 cluster that has not turned on YARN. I will open another thread for it.

Kim

On Wed, Apr 16, 2014 at 1:30 PM, Abdelrahman Shettia <[email protected]> wrote:

Hi Kim,

You can try to grep for the RM java process by running the following command:

ps aux | grep

On Wed, Apr 16, 2014 at 10:31 AM, Kim Chew <[email protected]> wrote:

Thanks Rahman, I had mixed things up a little in my mapred-site.xml, so it tried to run the job locally. Now I am running into the problem that Rahul has: I am unable to connect to the ResourceManager.

The targeted cluster runs MR1 instead of YARN, hence "mapreduce.framework.name" is set to "classic".

Here are my settings in my mapred-site.xml on the client side:

<property>
  <!-- Pointed to the remote JobTracker -->
  <name>mapreduce.job.tracker.address</name>
  <value>172.31.3.150:8021</value>
</property>
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

and my yarn-site.xml:

<property>
  <description>The hostname of the RM.</description>
  <name>yarn.resourcemanager.hostname</name>
  <value>172.31.3.150</value>
</property>
<property>
  <description>The address of the applications manager interface in the RM.</description>
  <name>yarn.resourcemanager.address</name>
  <value>${yarn.resourcemanager.hostname}:8032</value>
</property>

14/04/16 10:23:02 INFO client.RMProxy: Connecting to ResourceManager at /172.31.3.150:8032
14/04/16 10:23:10 INFO ipc.Client: Retrying connect to server: hadoop-host1.eng.narus.com/172.31.3.150:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

Therefore, the question is: how do I figure out where the ResourceManager is running?

TIA

Kim

On Wed, Apr 16, 2014 at 8:43 AM, Abdelrahman Shettia <[email protected]> wrote:

Hi Kim,

It looks like it is pointing to an HDFS location. Can you create the HDFS dir and put the jar there? Hope this helps.

Thanks,
Rahman

On Apr 16, 2014, at 8:39 AM, Rahul Singh <[email protected]> wrote:

Any help is welcome.
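Regarding Kim's question above (how to figure out where the ResourceManager is running): the client takes the address from yarn.resourcemanager.address in the yarn-site.xml on its classpath, and falls back to 0.0.0.0:8032 when that file is not found, which is what Rahul's log below shows. A minimal diagnostic sketch, assuming the stock Hadoop 2 client libraries (the class name is illustrative, not from this thread):

import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class PrintRmAddress {
    public static void main(String[] args) {
        // YarnConfiguration loads yarn-default.xml and yarn-site.xml from the classpath.
        YarnConfiguration conf = new YarnConfiguration();
        // "yarn.resourcemanager.address"; prints the 0.0.0.0:8032 default when
        // no yarn-site.xml is found on the classpath.
        System.out.println(conf.get(YarnConfiguration.RM_ADDRESS,
                YarnConfiguration.DEFAULT_RM_ADDRESS));
    }
}

If the printed address is correct but connections still fail, as in Kim's log above, the likely explanation is the one Kim himself reached: the cluster runs MR1, so no ResourceManager is listening at all.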
On Wed, Apr 16, 2014 at 1:13 PM, Rahul Singh <[email protected]> wrote:

Hi,
I am running the following command, but the jar is still not available to the mappers and reducers:

hadoop jar /home/hduser/workspace/Minerva.jar my.search.Minerva /user/hduser/input_minerva_actual /user/hduser/output_merva_actual3 -libjars /home/hduser/Documents/Lib/json-simple-1.1.1.jar -Dmapreduce.user.classpath.first=true

Error log:

14/04/16 13:08:37 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/04/16 13:08:37 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/04/16 13:08:37 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
14/04/16 13:08:37 INFO mapred.FileInputFormat: Total input paths to process : 1
14/04/16 13:08:37 INFO mapreduce.JobSubmitter: number of splits:10
14/04/16 13:08:37 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1397534064728_0028
14/04/16 13:08:38 INFO impl.YarnClientImpl: Submitted application application_1397534064728_0028
14/04/16 13:08:38 INFO mapreduce.Job: The url to track the job: http://L-Rahul-Tech:8088/proxy/application_1397534064728_0028/
14/04/16 13:08:38 INFO mapreduce.Job: Running job: job_1397534064728_0028
14/04/16 13:08:47 INFO mapreduce.Job: Job job_1397534064728_0028 running in uber mode : false
14/04/16 13:08:47 INFO mapreduce.Job: map 0% reduce 0%
14/04/16 13:08:58 INFO mapreduce.Job: Task Id : attempt_1397534064728_0028_m_000005_0, Status : FAILED
Error: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:622)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
    ... 9 more
Caused by: java.lang.NoClassDefFoundError: org/json/simple/parser/ParseException
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1821)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1786)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1880)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1906)
    at org.apache.hadoop.mapred.JobConf.getMapperClass(JobConf.java:1107)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
    ... 14 more
Caused by: java.lang.ClassNotFoundException: org.json.simple.parser.ParseException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
    ... 22 more

When I analyzed the logs, I saw:

"14/04/16 13:08:37 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this."

But I have implemented the Tool interface, as shown below:

package my.search;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class Minerva extends Configured implements Tool {
    public int run(String[] args) throws Exception {
        JobConf conf = new JobConf(Minerva.class);
        conf.setJobName("minerva sample job");

        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(TextArrayWritable.class);

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);

        conf.setMapperClass(Map.class);
        // conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);

        return 0;
    }

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new Minerva(), args);
        System.exit(res);
    }
}

Please let me know if you see any issues.
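Two details in the message above would explain both the warning and the missing jar, assuming stock Hadoop 2 behavior. First, GenericOptionsParser stops at the first non-option argument, so generic options such as -libjars and -D must come before the input and output paths; in the command above they come last and are never parsed. The reordered command would be:

hadoop jar /home/hduser/workspace/Minerva.jar my.search.Minerva -Dmapreduce.user.classpath.first=true -libjars /home/hduser/Documents/Lib/json-simple-1.1.1.jar /user/hduser/input_minerva_actual /user/hduser/output_merva_actual3

Second, run() builds its JobConf from scratch with new JobConf(Minerva.class), which throws away the Configuration that ToolRunner/GenericOptionsParser populated, including the libjars entries and the marker property (mapreduce.client.genericoptionsparser.used) whose absence makes JobSubmitter print the warning. A minimal sketch of the fix, changing only the JobConf construction:

    public int run(String[] args) throws Exception {
        // Seed the JobConf with the Configuration that ToolRunner populated;
        // it carries the -libjars and -D settings parsed from the command line.
        JobConf conf = new JobConf(getConf(), Minerva.class);
        conf.setJobName("minerva sample job");
        // ... the rest of the job setup stays exactly as above ...
        JobClient.runJob(conf);
        return 0;
    }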
On Thu, Apr 10, 2014 at 9:29 AM, Shengjun Xin <[email protected]> wrote:

Add '-Dmapreduce.user.classpath.first=true' to your command and try again.

--
Regards,
Shengjun

On Wed, Apr 9, 2014 at 6:27 AM, Kim Chew <[email protected]> wrote:

It seems to me that in Hadoop 2.2.1, the "libjars" option does not look for the jars on the local file system but on HDFS. For example:

hadoop jar target/myJar.jar Foo -libjars /home/kchew/test-libs/testJar.jar /user/kchew/inputs/raw.vector /user/kchew/outputs hdfs://remoteNN:8020 remoteJT:8021

14/04/08 15:11:02 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
14/04/08 15:11:02 INFO mapreduce.JobSubmitter: Cleaning up the staging area file:/tmp/hadoop-kchew/mapred/staging/kchew202924688/.staging/job_local202924688_0001
14/04/08 15:11:02 ERROR security.UserGroupInformation: PriviledgedActionException as:kchew (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: hdfs://remoteNN:8020/home/kchew/test-libs/testJar.jar
java.io.FileNotFoundException: File does not exist: hdfs://remoteNN:8020/home/kchew/test-libs/testJar.jar
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1110)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:264)

So under Hadoop 2.2.1, do I have to set some configuration explicitly so that the "libjars" option copies the file from the local file system to HDFS?

TIA

Kim
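On Kim's original question: the log shows the unqualified local path being resolved against the default filesystem (hdfs://remoteNN:8020), which is why the client looks for the jar in HDFS. One common workaround is to qualify the path explicitly, e.g. -libjars file:///home/kchew/test-libs/testJar.jar; another is Abdelrahman's suggestion of copying the jar into HDFS first. A minimal programmatic sketch of the first option, assuming the Hadoop 2 mapreduce API (the class and job names here are illustrative, not from the thread):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class SubmitWithLocalJar {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "libjars example");
        // Qualify the jar against the local filesystem so it cannot be
        // re-resolved against fs.defaultFS (hdfs://remoteNN:8020); the
        // qualified form is file:///home/kchew/test-libs/testJar.jar.
        Path jar = FileSystem.getLocal(conf)
                .makeQualified(new Path("/home/kchew/test-libs/testJar.jar"));
        // Close to what -libjars does: ship the jar with the job and put it
        // on the task classpath.
        job.addFileToClassPath(jar);
        // ... remaining job setup and submission as usual ...
    }
}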
