We used the Wikipedia splitter as a benchmark for our simulation on Hadoop 0.2. I am now trying to run it on the latest Hadoop to stay up to date and check for differences. For now, I have no other choice.

Regards,
Mahmood
On Thursday, March 13, 2014 10:12 PM, Andrew Musselman <[email protected]> wrote:

What's your larger goal here; are you putting Hadoop and Mahout through their paces as an exercise?

If your process is blowing through data quickly up to a certain point, there may be something happening with a common value, which is a "data bug". I don't know what this wikipedia splitter class does, but if you're interested in isolating the issue you could find out what is happening data-wise and see if there is some very large grouping on a pathologically frequent key, for instance.

On Thu, Mar 13, 2014 at 11:31 AM, Mahmood Naderan <[email protected]> wrote:

> I am pretty sure that there is something wrong with Hadoop/Mahout/Java.
> With any configuration, it gets stuck at chunk #571. Previous chunks are
> created rapidly, but it waits for about 30 minutes on 571, and that is
> the reason for the heap size error.
>
> I will try to submit a bug report.
>
> Regards,
> Mahmood
>
>
> On Thursday, March 13, 2014 2:31 PM, Mahmood Naderan <[email protected]>
> wrote:
>
> The strange thing is that whether I use -Xmx128m or -Xmx16384m, the process
> stops at chunk #571 (571*64 = 36.5GB).
> I still haven't figured out whether this is a problem with the JVM, Hadoop,
> or Mahout.
>
> I have tested various parameters on 16GB of RAM:
>
> <property>
>   <name>mapred.map.child.java.opts</name>
>   <value>-Xmx2048m</value>
> </property>
> <property>
>   <name>mapred.reduce.child.java.opts</name>
>   <value>-Xmx4096m</value>
> </property>
>
> Is there a relation between these parameters and the amount of available
> memory?
> I also see a HADOOP_HEAPSIZE setting in hadoop-env.sh which is commented
> out by default. What is that?
>
> Regards,
> Mahmood
>
>
> On Tuesday, March 11, 2014 11:57 PM, Mahmood Naderan <[email protected]>
> wrote:
>
> As I posted earlier, here is the result of a successful test:
>
> A 5.4GB XML file (which is larger than enwiki-latest-pages-articles10.xml)
> with 4GB of RAM and -Xmx128m took 5 minutes to complete.
>
> I didn't find a larger Wikipedia XML file. I need to test 10GB, 20GB, and
> 30GB files.
>
> Regards,
> Mahmood
>
>
> On Tuesday, March 11, 2014 11:41 PM, Andrew Musselman
> <[email protected]> wrote:
>
> Can you please try running this on a smaller file first, per Suneel's
> comment a while back:
>
> "Please first try running this on a smaller dataset like
> 'enwiki-latest-pages-articles10.xml' as opposed to running on the entire
> English Wikipedia."
>
>
> On Tue, Mar 11, 2014 at 12:56 PM, Mahmood Naderan <[email protected]> wrote:
>
> > Hi,
> > Recently I have faced a heap size error when I run
> >
> > $MAHOUT_HOME/bin/mahout wikipediaXMLSplitter -d
> > $MAHOUT_HOME/examples/temp/enwiki-latest-pages-articles.xml -o
> > wikipedia/chunks -c 64
> >
> > Here are the specs:
> > 1- XML file size = 44GB
> > 2- System memory = 54GB (on VirtualBox)
> > 3- Heap size = 51GB (-Xmx51000m)
> >
> > At the time of failure, I see that 571 chunks have been created
> > (hadoop dfs -ls), so 36GB of the original file has been processed.
> > Now here are my questions:
> >
> > 1- Is there any way to resume the process? As stated before, 571 chunks
> > have been created, so by resuming it could create the rest of the chunks
> > (572~).
> >
> > 2- Is it possible to parallelize the process? Assume 100GB of heap is
> > required to process the XML file and my system cannot afford that. Then
> > we can create 20 threads, each requiring 5GB of heap.
> > Next, by feeding the first 10 threads we can use the available 50GB of
> > heap, and after they complete, we can feed the next set of threads.
> >
> > Regards,
> > Mahmood
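
For reference on the HADOOP_HEAPSIZE question: that setting in hadoop-env.sh sets the maximum heap, in MB, for the Hadoop daemons started by the launcher scripts (NameNode, DataNode, JobTracker, TaskTracker). It does not change the heap of the per-task child JVMs, which is what mapred.map.child.java.opts and mapred.reduce.child.java.opts control. A minimal sketch, assuming a Hadoop 1.x-style hadoop-env.sh; the 2000 MB value is only an example:

    # hadoop-env.sh: heap (in MB) for the Hadoop daemons themselves.
    # This does not affect the map/reduce child JVMs, which take their
    # -Xmx from the mapred.*.child.java.opts properties instead.
    export HADOOP_HEAPSIZE=2000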
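
On the "data bug" suggestion: each chunk is 64MB, so chunk #571 corresponds roughly to the 36.5GB mark of the input. The sketch below is one rough way to peek at that region of a local, uncompressed copy of the dump and see whether it contains an unusually large <page> element; the byte offset is only a back-of-the-envelope estimate, since the splitter counts parsed pages rather than raw bytes:

    # Read 64MB starting near where chunk #571 would begin (570 * 64MB = 36480MB)
    # and count how many pages start in that window; a very small count would
    # point to one pathologically large page around the stall point.
    dd if=enwiki-latest-pages-articles.xml bs=1M skip=36480 count=64 2>/dev/null | grep -c '<page>'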
