Hi Matthias,

You are the man!

Thank you very much. I have been spinning my wheels on this for quite a while now.

Thanks again!

J

----- Original Message -----
From: "Matthias Jaekle" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, September 10, 2004 12:31 AM
Subject: Re: [Nutch-general] I'm Stumped - Finished Product Not working

> Hi Jason,
>
> This looks like Tomcat does not have enough memory. Try giving Java /
> Tomcat some more memory.
>
> I added this to catalina.sh (Linux):
> JAVA_OPTS=-Xmx256m
> Then it works fine for me.
>
> I also had to run
> chmod -R a+rx index
>
> Bye
>
> Matthias
>
>
> Jason Boss wrote:
>
> > Ok here is what I did.
> >
> > I split 5 million pages over 8 segments. They are all indexed and I go
> > to load them up, and 0-3 work fine. The only problem I had in 3 was
> > when the fetcher seemed to stall due to the parsing of pdf files. When
> > I load anything beyond 3 I get this:
> >
> > HTTP Status 500 -
> >
> > ------------------------------------------------------------------------
> >
> > type Exception report
> >
> > message
> >
> > description The server encountered an internal error () that prevented
> > it from fulfilling this request.
> >
> > exception
> >
> > javax.servlet.ServletException
> > org.apache.jasper.runtime.PageContextImpl.doHandlePageException(PageContextImpl.java:825)
> > org.apache.jasper.runtime.PageContextImpl.handlePageException(PageContextImpl.java:758)
> > org.apache.jsp.search_jsp._jspService(search_jsp.java:495)
> > org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:94)
> > javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
> > org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:324)
> > org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:292)
> > org.apache.jasper.servlet.JspServlet.service(JspServlet.java:236)
> > javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
> >
> > root cause
> >
> > java.lang.OutOfMemoryError
> >
> > note The full stack trace of the root cause is available in the Apache
> > Tomcat/5.0.28 logs.
> >
> > ------------------------------------------------------------------------
> >
> > Apache Tomcat/5.0.28
> >
> >
> > These are the segments that work:
> >
> > 19M segments/20040829092114/fetchlist
> > 21M segments/20040829092114/fetcher
> > 244M segments/20040829092114/content
> > 82M segments/20040829092114/parse_text
> > 143M segments/20040829092114/parse_data
> > 119M segments/20040829092114/index
> > 625M segments/20040829092114
> > 26M segments/20040829122947/fetchlist
> > 32M segments/20040829122947/fetcher
> > 646M segments/20040829122947/content
> > 217M segments/20040829122947/parse_text
> > 358M segments/20040829122947/parse_data
> > 297M segments/20040829122947/index
> > 1.6G segments/20040829122947
> > 70M segments/20040829124357/fetchlist
> > 84M segments/20040829124357/fetcher
> > 2.2G segments/20040829124357/content
> > 811M segments/20040829124357/parse_text
> > 1.5G segments/20040829124357/parse_data
> > 1.1G segments/20040829124357/index
> > 5.6G segments/20040829124357
> > 90M segments/20040829130541/fetchlist
> > 102M segments/20040829130541/fetcher
> > 977M segments/20040829130541/content
> > 301M segments/20040829130541/parse_text
> > 547M segments/20040829130541/parse_data
> > 428M segments/20040829130541/index
> > 2.4G segments/20040829130541
> > 28M segments/20040829212107/fetchlist
> > 34M segments/20040829212107/fetcher
> > 757M segments/20040829212107/content
> > 256M segments/20040829212107/parse_text
> > 502M segments/20040829212107/parse_data
> > 344M segments/20040829212107/index
> > 1.9G segments/20040829212107
> > 28M segments/20040829225928/fetchlist
> > 33M segments/20040829225928/fetcher
> > 694M segments/20040829225928/content
> > 243M segments/20040829225928/parse_text
> > 398M segments/20040829225928/parse_data
> > 329M segments/20040829225928/index
> > 1.7G segments/20040829225928
> > 29M segments/20040830042947/fetchlist
> > 35M segments/20040830042947/fetcher
> > 890M segments/20040830042947/content
> > 292M segments/20040830042947/parse_text
> > 673M segments/20040830042947/parse_data
> > 376M segments/20040830042947/index
> > 2.3G segments/20040830042947
> > 29M segments/20040830043001/fetchlist
> > 35M segments/20040830043001/fetcher
> > 893M segments/20040830043001/content
> > 293M segments/20040830043001/parse_text
> > 678M segments/20040830043001/parse_data
> > 377M segments/20040830043001/index
> > 2.3G segments/20040830043001
> > 29M segments/20040830065943/fetchlist
> > 35M segments/20040830065943/fetcher
> > 872M segments/20040830065943/content
> > 292M segments/20040830065943/parse_text
> > 661M segments/20040830065943/parse_data
> > 377M segments/20040830065943/index
> > 2.3G segments/20040830065943
> > 154M segments/20040830111830/fetchlist
> > 183M segments/20040830111830/fetcher
> > 5.1G segments/20040830111830/content
> > 1.7G segments/20040830111830/parse_text
> > 3.9G segments/20040830111830/parse_data
> > 2.2G segments/20040830111830/index
> > 13G segments/20040830111830
> > 12M segments/20040904035557-0/fetchlist
> > 15M segments/20040904035557-0/fetcher
> > 316M segments/20040904035557-0/content
> > 122M segments/20040904035557-0/parse_text
> > 203M segments/20040904035557-0/parse_data
> > 163M segments/20040904035557-0/index
> > 829M segments/20040904035557-0
> > 13M segments/20040904035557-1/fetchlist
> > 16M segments/20040904035557-1/fetcher
> > 333M segments/20040904035557-1/content
> > 132M segments/20040904035557-1/parse_text
> > 210M segments/20040904035557-1/parse_data
> > 177M segments/20040904035557-1/index
> > 878M segments/20040904035557-1
> > 12M segments/20040904035557-2/fetchlist
> > 15M segments/20040904035557-2/fetcher
> > 328M segments/20040904035557-2/content
> > 126M segments/20040904035557-2/parse_text
> > 207M segments/20040904035557-2/parse_data
> > 170M segments/20040904035557-2/index
> > 855M segments/20040904035557-2
> > 73M segments/20040905133243-0/fetchlist
> > 87M segments/20040905133243-0/fetcher
> > 2.0G segments/20040905133243-0/content
> > 762M segments/20040905133243-0/parse_text
> > 1.4G segments/20040905133243-0/parse_data
> > 974M segments/20040905133243-0/index
> > 5.3G segments/20040905133243-0
> > 73M segments/20040905133243-1/fetchlist
> > 87M segments/20040905133243-1/fetcher
> > 2.0G segments/20040905133243-1/content
> > 746M segments/20040905133243-1/parse_text
> > 1.4G segments/20040905133243-1/parse_data
> > 958M segments/20040905133243-1/index
> > 5.2G segments/20040905133243-1
> > 72M segments/20040905133243-2/fetchlist
> > 86M segments/20040905133243-2/fetcher
> > 1.9G segments/20040905133243-2/content
> > 722M segments/20040905133243-2/parse_text
> > 1.4G segments/20040905133243-2/parse_data
> > 935M segments/20040905133243-2/index
> > 5.0G segments/20040905133243-2
> > 71M segments/20040905133243-3/fetchlist
> > 84M segments/20040905133243-3/fetcher
> > 1.9G segments/20040905133243-3/content
> > 727M segments/20040905133243-3/parse_text
> > 1.3G segments/20040905133243-3/parse_data
> > 926M segments/20040905133243-3/index
> > 5.0G segments/20040905133243-3
> > 57G segments
> >
> > When I merge this, the index directory is right at 10 gigs.
> >
> > Are there limits to the /index directory size? Am I missing something
> > simple?
> >
> > Thanks!
> >
> > J
>
>
> --
> http://gmbh.eventax.de - eventax GmbH
> http://www.umkreisfinder.de - The search engine for local content and events
> http://www.fahnen-drucken.de - Make your own flags easily
>
>
>
> -------------------------------------------------------
> This SF.Net email is sponsored by: YOU BE THE JUDGE. Be one of 170
> Project Admins to receive an Apple iPod Mini FREE for your judgement on
> who ports your project to Linux PPC the best. Sponsored by IBM.
> Deadline: Sept. 13. Go here: http://sf.net/ppc_contest.php
> _______________________________________________
> Nutch-general mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/nutch-general
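
For anyone who finds this thread with the same OutOfMemoryError: the two changes from Matthias's reply could be sketched roughly as below. This is only a sketch under assumptions. In the reply the JAVA_OPTS line was added to Tomcat's catalina.sh itself; exporting it in the shell that starts Tomcat is an assumed alternative. INDEX_DIR is a hypothetical placeholder for wherever your index directory lives, and 256m is simply the value that worked in the reply, not a recommendation for every index size.

```shell
# Raise the JVM heap ceiling for Tomcat (value taken from the reply above).
# Assumption: this is exported in the shell that launches Tomcat; the reply
# instead added the line directly to catalina.sh.
JAVA_OPTS="-Xmx256m"
export JAVA_OPTS

# Make the index readable and traversable for all users, so the account
# Tomcat runs under can open it.
# INDEX_DIR is a hypothetical placeholder; the reply used a relative "index".
INDEX_DIR="${INDEX_DIR:-./index}"
if [ -d "$INDEX_DIR" ]; then
    chmod -R a+rx "$INDEX_DIR"
fi
```

Tomcat has to be restarted after changing JAVA_OPTS for the new heap limit to take effect.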
