bin/nutch -Xss1024k index crawl1/indexes crawl1/crawldb crawl1/linkdb crawl/segments/*
Exception in thread "main" java.lang.NoClassDefFoundError: index
Caused by: java.lang.ClassNotFoundException: index
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
Could not find the main class: index.  Program will exit.

Do you have to set the -Xss flag somewhere else?

Thanks, 

Eric 

On Jan 11, 2010, at 8:36 AM, Godmar Back wrote:

> Very intriguing, considering that we teach our students to avoid
> recursion where possible for this very reason.
> 
> Googling reveals
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4675952 and
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5050507 so you
> could try increasing the Java stack size in bin/nutch (-Xss), or use
> an alternate regexp if you can.
> 
> Just out of curiosity, why does a performance critical program such as
> Nutch use Sun's backtracking-based regexp implementation rather than
> an efficient Thompson-based one?  Do you need the additional
> expressiveness provided by PCRE?
> 
> - Godmar
> 
> On Mon, Jan 11, 2010 at 11:24 AM, Eric Osgood <e...@lakemeadonline.com> wrote:
>> During a crawl of about 3.8M tlds to a depth of 2, when I try to index the 
>> segments, I get the following error:
>> 
>> java.lang.StackOverflowError
>>        at java.util.regex.Pattern$Loop.match(Pattern.java:4295)
>> Any help with this error would be much appreciated; I have encountered this 
>> before.
>> 
>> Here are the last 10 lines of the hadoop.log file:
>> 
>> tail -n 10 hadoop.log.2010-01-10
>>        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4227)
>>        at java.util.regex.Pattern$BranchConn.match(Pattern.java:4078)
>>        at java.util.regex.Pattern$Ques.match(Pattern.java:3691)
>>        at java.util.regex.Pattern$Branch.match(Pattern.java:4114)
>>        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4168)
>>        at java.util.regex.Pattern$Loop.match(Pattern.java:4295)
>>        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4227)
>>        at java.util.regex.Pattern$BranchConn.match(Pattern.java:4078)
>>        at java.util.regex.Pattern$Ques.match(Pattern.java:3691)
>> 2010-01-11 00:31:53,221 WARN  io.UTF8 - truncating long string: 62492 chars, starting with java.lang.StackOverf
>> 
>> 
>> 
>> Eric Osgood
>> ---------------------------------------------
>> Cal Poly - Computer Engineering, Moon Valley Software
>> ---------------------------------------------
>> eosg...@calpoly.edu, e...@lakemeadonline.com
>> ---------------------------------------------
>> www.calpoly.edu/~eosgood, www.lakemeadonline.com
>> 
>> 

Eric Osgood
---------------------------------------------
Cal Poly - Computer Engineering, Moon Valley Software
---------------------------------------------
eosg...@calpoly.edu, e...@lakemeadonline.com
---------------------------------------------
www.calpoly.edu/~eosgood, www.lakemeadonline.com
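
A note on the -Xss question above, offered as a sketch rather than a confirmed answer. The NoClassDefFoundError suggests that bin/nutch took its first argument (-Xss1024k) as the command name, so the JVM ended up treating "index" as the main class. If your copy of bin/nutch reads a NUTCH_OPTS environment variable for extra JVM options (an assumption; check the comments at the top of the script), that would be the place to try passing -Xss. Separately, the plain Java below illustrates the stack-size idea from the quoted reply: a deeply recursive java.util.regex match is run on a worker thread that requests a larger stack. The names RegexStackDemo and matchesWithStack, the pattern (a|b)*, and the 128 MB figure are all illustrative and are not taken from Nutch. The JVM is allowed to treat the requested stack size as a hint only.

```java
import java.util.regex.Pattern;

// Illustrative only: not Nutch code. Shows one way to give a recursive
// java.util.regex match more stack without changing the JVM-wide -Xss value.
final class RegexStackDemo {

    // Hypothetical helper: run regex.matches(input) on a thread that
    // requests stackBytes of stack, and return the match result.
    static boolean matchesWithStack(String regex, String input, long stackBytes) {
        final boolean[] result = new boolean[1];
        final Throwable[] failure = new Throwable[1];

        Runnable task = () -> {
            try {
                result[0] = Pattern.compile(regex).matcher(input).matches();
            } catch (Throwable t) {
                // StackOverflowError is an Error, so catch Throwable here.
                failure[0] = t;
            }
        };

        // The four-argument Thread constructor takes a requested stack size
        // in bytes; the JVM may round it or ignore it on some platforms.
        Thread worker = new Thread(null, task, "regex-worker", stackBytes);
        worker.start();
        try {
            worker.join(); // join() makes the worker's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while matching", e);
        }
        if (failure[0] != null) {
            throw new IllegalStateException("regex match failed", failure[0]);
        }
        return result[0];
    }

    public static void main(String[] args) {
        // Build a long input; alternation inside a starred group is the kind
        // of pattern that recurses once per repetition in java.util.regex.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10000; i++) {
            sb.append(i % 2 == 0 ? 'a' : 'b');
        }
        long stackBytes = 128L * 1024 * 1024; // assumed value, tune as needed
        System.out.println(matchesWithStack("(a|b)*", sb.toString(), stackBytes));
    }
}
```

This only moves the limit; the match still recurses once per repetition, so a long enough input can overflow any fixed stack. Rewriting the offending pattern, as suggested in the quoted reply, is the more durable fix where it is possible.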
