I've tested it with part of the whole file.
I've tried different values for the -Xmx parameter in the Hadoop mapred config file.
It worked with -Xmx set to roughly 30000.
I don't know why it worked; I tested it on a VM with 3 GB of RAM.
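
In case it helps, here is a rough sketch of where such an -Xmx setting can
go, assuming the Hadoop 2.x property names for the task JVM heap
(mapreduce.map.java.opts / mapreduce.reduce.java.opts); the keys and the
30000m value are only an example and may need adjusting for your setup:

    <!-- mapred-site.xml: raise the heap passed to MapReduce task JVMs -->
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx30000m</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx30000m</value>
    </property>

If the splitter does its work in the client JVM rather than in a task, the
heap of the bin/mahout launcher can also be raised through the
MAHOUT_HEAPSIZE environment variable (value in MB), e.g.
"export MAHOUT_HEAPSIZE=30000" before running the command.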




On Thu, Mar 13, 2014 at 8:19 PM, mahmood (JIRA) <[email protected]> wrote:

>
>      [
> https://issues.apache.org/jira/browse/MAHOUT-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
>
> mahmood updated MAHOUT-1456:
> ----------------------------
>
>     Description:
> 1- The XML file is
> http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2
> 2- When I run "mahout wikipediaXMLSplitter -d
> enwiki-latest-pages-articles.xml -o wikipedia/chunks -c 64", it stuck at
> chunk #571 and after 30 minutes it fails to continue with the java heap
> size error. Previous chunks are created rapidly (10 chunks per second).
> 3- Increasing the heap size via "-Xmx4096m" option doesn't work.
> 4- No matter what is the configuration, it seems that there is a memory
> leak that eat all space.
>
>   was:
> 1- The XML file is
> http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2
> 2- When I run "mahout wikipediaXMLSplitter -d enwiki-latest-pages-articles
> -o wikipedia/chunks -c 64", it stuck at chunk #571 and after 30 minutes it
> fails to continue with the java heap size error. Previous chunks are
> created rapidly (10 chunks per second).
> 3- Increasing the heap size via "-Xmx4096m" option doesn't work.
> 4- No matter what is the configuration, it seems that there is a memory
> leak that eat all space.
>
>
> > The wikipediaXMLSplitter example fails with "heap size" error
> > -------------------------------------------------------------
> >
> >                 Key: MAHOUT-1456
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-1456
> >             Project: Mahout
> >          Issue Type: Bug
> >          Components: Examples
> >    Affects Versions: 0.9
> >         Environment: Solaris 11.1 \
> > Hadoop 2.3.0 \
> > Maven 3.2.1 \
> > JDK 1.7.0_07-b10 \
> >            Reporter: mahmood
> >              Labels: Heap, mahout, wikipediaXMLSplitter
> >
> > 1- The XML file is
> > http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2
> > 2- When I run "mahout wikipediaXMLSplitter -d
> > enwiki-latest-pages-articles.xml -o wikipedia/chunks -c 64", it gets stuck
> > at chunk #571 and after 30 minutes fails with a Java heap size error.
> > Previous chunks are created rapidly (10 chunks per second).
> > 3- Increasing the heap size via the "-Xmx4096m" option doesn't work.
> > 4- No matter what the configuration is, there seems to be a memory leak
> > that eats all available heap space.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.2#6252)
>
