bin/nutch -Xss1024k index crawl1/indexes crawl1/crawldb crawl1/linkdb crawl/segments/*
Exception in thread "main" java.lang.NoClassDefFoundError: index
Caused by: java.lang.ClassNotFoundException: index
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
Could not find the main class: index. Program will exit.

Do you have to set the -Xss flag somewhere else?

Thanks,

Eric

On Jan 11, 2010, at 8:36 AM, Godmar Back wrote:

> Very intriguing, considering that we teach our students to avoid
> recursion where possible for this very reason.
>
> Googling reveals
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4675952 and
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5050507 so you
> could try increasing the Java stack size in bin/nutch (-Xss), or use
> an alternate regexp if you can.
>
> Just out of curiosity, why does a performance critical program such as
> Nutch use Sun's backtracking-based regexp implementation rather than
> an efficient Thompson-based one? Do you need the additional
> expressiveness provided by PCRE?
>
> - Godmar
>
> On Mon, Jan 11, 2010 at 11:24 AM, Eric Osgood <e...@lakemeadonline.com> wrote:
>> During a crawl of about 3.8M tlds to a depth of 2, when I try to index the
>> segments, I get the following error:
>>
>> java.lang.StackOverflowError
>> at java.util.regex.Pattern$Loop.match(Pattern.java:4295)
>> Any help with this error would be much appreciated, I have encountered this
>> before.
>>
>> here is the last 10 lines of the hadoop.log file:
>>
>> tail -n 10 hadoop.log.2010-01-10
>> at java.util.regex.Pattern$GroupTail.match(Pattern.java:4227)
>> at java.util.regex.Pattern$BranchConn.match(Pattern.java:4078)
>> at java.util.regex.Pattern$Ques.match(Pattern.java:3691)
>> at java.util.regex.Pattern$Branch.match(Pattern.java:4114)
>> at java.util.regex.Pattern$GroupHead.match(Pattern.java:4168)
>> at java.util.regex.Pattern$Loop.match(Pattern.java:4295)
>> at java.util.regex.Pattern$GroupTail.match(Pattern.java:4227)
>> at java.util.regex.Pattern$BranchConn.match(Pattern.java:4078)
>> at java.util.regex.Pattern$Ques.match(Pattern.java:3691)
>> 2010-01-11 00:31:53,221 WARN io.UTF8 - truncating long string: 62492 chars,
>> starting with java.lang.StackOverf
>>
>> Eric Osgood
>> ---------------------------------------------
>> Cal Poly - Computer Engineering, Moon Valley Software
>> ---------------------------------------------
>> eosg...@calpoly.edu, e...@lakemeadonline.com
>> ---------------------------------------------
>> www.calpoly.edu/~eosgood, www.lakemeadonline.com

Eric Osgood
---------------------------------------------
Cal Poly - Computer Engineering, Moon Valley Software
---------------------------------------------
eosg...@calpoly.edu, e...@lakemeadonline.com
---------------------------------------------
www.calpoly.edu/~eosgood, www.lakemeadonline.com
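
A possible reading of the NoClassDefFoundError at the top of this message: the error is what you would expect if the launcher takes its first argument as the command name and, when it does not recognize it, passes it to java verbatim. Under that assumption -Xss1024k lands in the JVM option position and "index" ends up where the main class should be. The sketch below is only a toy model of that dispatch, not the real bin/nutch; model_launcher and EXTRA_OPTS are made-up names, and the class mapping for "index" is assumed.

```shell
#!/bin/sh
# Hypothetical, simplified model of a launcher's dispatch logic.
# It only echoes the java command line such a dispatch would build.
model_launcher() {
  cmd=$1                 # the first argument is taken as the command name
  shift
  case "$cmd" in
    index) class=org.apache.nutch.indexer.Indexer ;;  # assumed mapping
    *)     class=$cmd ;;  # unrecognized command: passed through verbatim
  esac
  # EXTRA_OPTS stands in for a JVM-options variable the launcher may honor
  echo java $EXTRA_OPTS "$class" "$@"
}

# Flag placed first: it is treated as the "command", so "index" becomes the class.
model_launcher -Xss1024k index crawl1/indexes
# prints: java -Xss1024k index crawl1/indexes

# Flag supplied through the options variable: "index" is dispatched normally.
( EXTRA_OPTS=-Xss1024k; model_launcher index crawl1/indexes )
# prints: java -Xss1024k org.apache.nutch.indexer.Indexer crawl1/indexes
```

If your copy of bin/nutch honors a NUTCH_OPTS environment variable (the header comments of the script should say; treat this as an assumption to check), the real-world equivalent of the second call would be something like NUTCH_OPTS="-Xss1024k" bin/nutch index ... rather than putting the flag after bin/nutch.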