Just took a look at the bin/hadoop of your particular version (http://svn.apache.org/viewvc/hadoop/common/tags/release-0.19.2/bin/hadoop?revision=796970&view=markup). It looks like HADOOP_CLIENT_OPTS doesn't work with the jar command in that script, which is fixed in later versions.
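The relevant dispatch logic looks roughly like this (paraphrased rather than quoted verbatim from the 0.19.2 script, so the exact lines may differ):

  # (earlier branches elided) client commands append HADOOP_CLIENT_OPTS
  # to the java options...
  elif [ "$COMMAND" = "fs" ] ; then
    CLASS=org.apache.hadoop.fs.FsShell
    HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
  # ...but the jar branch only sets the class to run, so the variable is
  # silently dropped:
  elif [ "$COMMAND" = "jar" ] ; then
    CLASS=org.apache.hadoop.util.RunJar
  fi

  # the final command line always includes HADOOP_OPTS, which is why the
  # workaround below works:
  exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS -classpath "$CLASSPATH" $CLASS "$@"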
So try HADOOP_OPTS=-Xmx1000M bin/hadoop ... instead. It works because it translates to the same java command line that already worked for you :)
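To put the failing and the working invocations side by side (same jar and heap size as in your examples below):

  # silently ignored by the 0.19.2 script, as shown above:
  HADOOP_CLIENT_OPTS=-Xmx4000m bin/hadoop jar WordCount.jar OOloadtest

  # honored, because HADOOP_OPTS always reaches the java command line:
  HADOOP_OPTS=-Xmx4000m bin/hadoop jar WordCount.jar OOloadtest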
__Luke

On Wed, Oct 13, 2010 at 4:18 PM, Shi Yu <[email protected]> wrote:
> Hi, I tried the following five ways:
>
> Approach 1: on the command line
> HADOOP_CLIENT_OPTS=-Xmx4000m bin/hadoop jar WordCount.jar OOloadtest
>
> Approach 2: I added the following element to hadoop-site.xml. Each time I
> changed it, I stopped and restarted hadoop on all the nodes.
> ...
> <property>
>   <name>HADOOP_CLIENT_OPTS</name>
>   <value>-Xmx4000m</value>
> </property>
> ...
>
> then ran the command:
> $ bin/hadoop jar WordCount.jar OOloadtest
>
> Approach 3: I changed it to
> ...
> <property>
>   <name>HADOOP_CLIENT_OPTS</name>
>   <value>4000m</value>
> </property>
> ...
>
> then ran the command:
> $ bin/hadoop jar WordCount.jar OOloadtest
>
> Approach 4: to make sure, I changed the "m" to plain digits:
> ...
> <property>
>   <name>HADOOP_CLIENT_OPTS</name>
>   <value>4000000000</value>
> </property>
> ...
>
> then ran the command:
> $ bin/hadoop jar WordCount.jar OOloadtest
>
> All four approaches end in the same "Java heap space" error:
>
> java.lang.OutOfMemoryError: Java heap space
>     at java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:45)
>     at java.lang.StringBuilder.<init>(StringBuilder.java:68)
>     at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java:2997)
>     at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2818)
>     at java.io.ObjectInputStream.readString(ObjectInputStream.java:1599)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1320)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
>     at java.util.HashMap.readObject(HashMap.java:1028)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1846)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
>     at ObjectManager.loadObject(ObjectManager.java:42)
>     at OOloadtest.main(OOloadtest.java:21)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
>     at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>     at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
>
> Approach 5: in comparison, I called the java command directly as follows
> (there is a counter showing how much time it takes when the serialized
> object is successfully loaded):
>
> $ java -Xms3G -Xmx3G -classpath .:WordCount.jar:hadoop-0.19.2-core.jar:lib/log4j-1.2.15.jar OOloadtest
>
> It returns:
> object loaded, timing (hms): 0 hour(s) 1 minute(s) 12 second(s) 162 millisecond(s)
>
> What was the problem with my command? Where can I find documentation about
> HADOOP_CLIENT_OPTS? Have you tried the same thing and found it works?
>
> Shi
>
> On 2010-10-13 16:28, Luke Lu wrote:
>> On Wed, Oct 13, 2010 at 2:21 PM, Shi Yu <[email protected]> wrote:
>>> Hi, thanks for the advice. I tried with your settings,
>>> $ bin/hadoop jar Test.jar OOloadtest -D HADOOP_CLIENT_OPTS=-Xmx4000m
>>> and still no effect. Or is this a system variable? Should I export it?
>>> How do I configure it?
>>
>> HADOOP_CLIENT_OPTS is an environment variable, so you should run it as
>> HADOOP_CLIENT_OPTS=-Xmx1000m bin/hadoop jar Test.jar OOloadtest
>> if you use sh-derivative shells (bash, ksh etc.); prepend env for other
>> shells.
>>
>> __Luke
>>
>>> Shi
>>>
>>> java -Xms3G -Xmx3G -classpath .:WordCount.jar:hadoop-0.19.2-core.jar:lib/commons-collections-3.2.1.jar:lib/log4j-1.2.15.jar:lib/stanford-postagger-2010-05-26.jar OOloadtest
>>>
>>> On 2010-10-13 15:28, Luke Lu wrote:
>>>> On Wed, Oct 13, 2010 at 12:27 PM, Shi Yu <[email protected]> wrote:
>>>>> I haven't implemented anything in map/reduce yet for this issue. I
>>>>> just tried to invoke the same java class using the bin/hadoop
>>>>> command. The thing is, a very simple program can be executed with
>>>>> java, but not with the bin/hadoop command.
>>>>
>>>> If you are just trying the bin/hadoop jar your.jar command, your code
>>>> runs in a local client jvm and mapred.child.java.opts has no effect.
>>>> You should run it with HADOOP_CLIENT_OPTS=-Xmx1000m bin/hadoop jar
>>>> your.jar
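A side note on the two knobs, since they look similar: HADOOP_CLIENT_OPTS is an environment variable for the local client jvm, so it belongs on the command line or in hadoop-env.sh, never in hadoop-site.xml; -D and the config file set job configuration properties, which is why approaches 2-4 above have no effect:

  HADOOP_CLIENT_OPTS=-Xmx1000m bin/hadoop jar your.jar

mapred.child.java.opts, by contrast, is a configuration property for the task jvms forked on the cluster for a submitted job, and it never affects the client jvm. As a property it would be set like this:

  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1000m</value>
  </property>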
>>>>> I think if I couldn't get through the first stage, even if I had a
>>>>> map/reduce program it would also fail. I am using Hadoop 0.19.2.
>>>>> Thanks.
>>>>>
>>>>> Best Regards,
>>>>>
>>>>> Shi
>>>>>
>>>>> On 2010-10-13 14:15, Luke Lu wrote:
>>>>>> Can you post your mapper/reducer implementation? Or are you using
>>>>>> hadoop streaming, for which mapred.child.java.opts doesn't apply to
>>>>>> the jvm you care about? BTW, what's the hadoop version you're using?
>>>>>>
>>>>>> On Wed, Oct 13, 2010 at 11:45 AM, Shi Yu <[email protected]> wrote:
>>>>>>> Here is my code. There is no Map/Reduce in it. I can run this code
>>>>>>> using java -Xmx1000m; however, when using bin/hadoop -D
>>>>>>> mapred.child.java.opts=-Xmx3000M it still hits the heap space
>>>>>>> error. I have tried other programs in Hadoop with the same
>>>>>>> settings, so the memory is available on my machines.
>>>>>>>
>>>>>>> public static void main(String[] args) {
>>>>>>>     try {
>>>>>>>         String myFile = "xxx.dat";
>>>>>>>         FileInputStream fin = new FileInputStream(myFile);
>>>>>>>         ois = new ObjectInputStream(fin);
>>>>>>>         margintagMap = ois.readObject();
>>>>>>>         ois.close();
>>>>>>>         fin.close();
>>>>>>>     } catch (Exception e) {
>>>>>>>         //
>>>>>>>     }
>>>>>>> }
>>>>>>>
>>>>>>> On 2010-10-13 13:30, Luke Lu wrote:
>>>>>>>> On Wed, Oct 13, 2010 at 8:04 AM, Shi Yu <[email protected]> wrote:
>>>>>>>>> As a follow-up to my own question, I think invoking the JVM in
>>>>>>>>> Hadoop requires much more memory than an ordinary JVM.
>>>>>>>>
>>>>>>>> That's simply not true. The default mapreduce task Xmx is 200M,
>>>>>>>> which is much smaller than the standard jvm default 512M, and most
>>>>>>>> users don't need to increase it. Please post the code reading the
>>>>>>>> object (in hdfs?) in your tasks.
>>>>>>>>
>>>>>>>>> I found that instead of serializing the object, maybe I could
>>>>>>>>> create a MapFile as an index to permit lookups by key in Hadoop.
>>>>>>>>> I have also compared the performance of MongoDB and Memcache. I
>>>>>>>>> will let you know the result after I try the MapFile approach.
>>>>>>>>>
>>>>>>>>> Shi
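The MapFile idea should work well here, since it leaves the data on disk and reads only the entries you ask for instead of deserializing a 200M HashMap onto the heap. A minimal sketch against the 0.19-era MapFile API, assuming Text keys and values (the directory name and sample entries are made up):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.io.MapFile;
  import org.apache.hadoop.io.Text;

  public class MapFileLookup {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      FileSystem fs = FileSystem.get(conf);
      String dir = "margintag.map";  // hypothetical map directory

      // Build the index once; MapFile requires keys appended in sorted order.
      MapFile.Writer writer =
          new MapFile.Writer(conf, fs, dir, Text.class, Text.class);
      writer.append(new Text("aardvark"), new Text("NN"));
      writer.append(new Text("zebra"), new Text("NN"));
      writer.close();

      // Look up individual keys without holding the whole map in memory.
      MapFile.Reader reader = new MapFile.Reader(fs, dir, conf);
      Text value = new Text();
      if (reader.get(new Text("zebra"), value) != null) {
        System.out.println("zebra -> " + value);
      }
      reader.close();
    }
  }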
>>>>>>>>> On 2010-10-12 21:59, M. C. Srivas wrote:
>>>>>>>>>> On Tue, Oct 12, 2010 at 4:50 AM, Shi Yu <[email protected]> wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I want to load a serialized HashMap object in hadoop. The file
>>>>>>>>>>> of the stored object is 200M. I can read that object
>>>>>>>>>>> efficiently in Java by setting -Xmx to 1000M. However, in
>>>>>>>>>>> hadoop I can never load it into memory. The code is very simple
>>>>>>>>>>> (just read the ObjectInputStream) and there is as yet no
>>>>>>>>>>> map/reduce implemented. I set
>>>>>>>>>>> mapred.child.java.opts=-Xmx3000M and still get the
>>>>>>>>>>> "java.lang.OutOfMemoryError: Java heap space". Could anyone
>>>>>>>>>>> explain a little bit how memory is allocated to the JVM in
>>>>>>>>>>> hadoop? Why does hadoop take up so much memory? If a program
>>>>>>>>>>> requires 1G of memory on a single node, how much memory does it
>>>>>>>>>>> require (generally) in Hadoop?
>>>>>>>>>>
>>>>>>>>>> The JVM reserves swap space in advance, at the time of launching
>>>>>>>>>> the process. If your swap is too low (or you do not have any
>>>>>>>>>> swap configured), you will hit this.
>>>>>>>>>>
>>>>>>>>>> Or, you are on a 32-bit machine, in which case 3G is not
>>>>>>>>>> possible in the JVM.
>>>>>>>>>>
>>>>>>>>>> -Srivas.
>>>>>>>>>>
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>> Shi
>>>>>>>
>>>>>>> --
>>>>>>> Postdoctoral Scholar
>>>>>>> Institute for Genomics and Systems Biology
>>>>>>> Department of Medicine, the University of Chicago
>>>>>>> Knapp Center for Biomedical Discovery
>>>>>>> 900 E. 57th St. Room 10148
>>>>>>> Chicago, IL 60637, US
>>>>>>> Tel: 773-702-6799
