How big is the crawl you are running? Which version of Nutch are you using?
Jason

On Wed, Jun 11, 2008 at 8:32 PM, Siddhartha Reddy <[EMAIL PROTECTED]> wrote:
> Hi,
>
> While parsing some pages, I am getting a java.lang.StackOverflowError
> exception due to the recursion in HTMLMetaProcessor.getMetaTagsHelper. I'm
> pasting part of the stack trace below. Unfortunately, I've logic that
> deletes the segment if fetch/parse fails, so I do not know which particular
> web page caused this problem; I'll recrawl the same pages with modified
> logic (that does not delete the segment on failed parsing) and try to find
> the offending URL.
>
> Did anyone encounter such a problem before? Apart from increasing the stack
> size for Java, is there any other possible solution?
>
> java.lang.StackOverflowError
>     at java.lang.Character.toUpperCase(Character.java:4278)
>     at java.lang.String.regionMatches(String.java:1384)
>     at java.lang.String.equalsIgnoreCase(String.java:1120)
>     at org.apache.nutch.parse.html.HTMLMetaProcessor.getMetaTagsHelper(HTMLMetaProcessor.java:55)
>     at org.apache.nutch.parse.html.HTMLMetaProcessor.getMetaTagsHelper(HTMLMetaProcessor.java:208)
>     at org.apache.nutch.parse.html.HTMLMetaProcessor.getMetaTagsHelper(HTMLMetaProcessor.java:208)
>     at org.apache.nutch.parse.html.HTMLMetaProcessor.getMetaTagsHelper(HTMLMetaProcessor.java:208)
>     at org.apache.nutch.parse.html.HTMLMetaProcessor.getMetaTagsHelper(HTMLMetaProcessor.java:208)
>     at org.apache.nutch.parse.html.HTMLMetaProcessor.getMetaTagsHelper(HTMLMetaProcessor.java:208)
>     ....
>
> Thanks,
> Siddhartha
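
On the question of alternatives to a larger stack: if the overflow really comes from a recursive walk over a very deeply nested DOM, as the repeated getMetaTagsHelper frames suggest, one general option is to replace the recursion with an explicit stack kept on the heap. The sketch below is not Nutch's code. The class IterativeMetaWalk and the methods countMetaTags and buildDeepDocument are hypothetical names made up for illustration, and it only counts <meta> elements rather than reproducing what HTMLMetaProcessor does with them.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration, not part of Nutch.
class IterativeMetaWalk {

    /** Counts <meta> elements under root using an explicit stack instead of recursion. */
    public static int countMetaTags(Node root) {
        int count = 0;
        Deque<Node> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            if (node.getNodeType() == Node.ELEMENT_NODE
                    && "meta".equalsIgnoreCase(node.getNodeName())) {
                count++;
            }
            // Push children last-to-first so they are popped in document order.
            for (Node child = node.getLastChild(); child != null;
                    child = child.getPreviousSibling()) {
                pending.push(child);
            }
        }
        return count;
    }

    /** Builds a chain of nested <div> elements with a single <meta> at the bottom. */
    public static Document buildDeepDocument(int depth) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        // Build from the innermost element outwards, then attach the root.
        Element current = doc.createElement("div");
        current.appendChild(doc.createElement("meta"));
        for (int i = 1; i < depth; i++) {
            Element parent = doc.createElement("div");
            parent.appendChild(current);
            current = parent;
        }
        Element html = doc.createElement("html");
        html.appendChild(current);
        doc.appendChild(html);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        // Nesting this deep may exhaust the default thread stack in a recursive walk.
        Document doc = buildDeepDocument(50_000);
        System.out.println("meta tags found: " + countMetaTags(doc));
    }
}
```

With this shape, the depth of the page only affects how large the ArrayDeque grows on the heap, not how many stack frames are used. Whether it fits the real helper depends on what state the recursive version carries between calls, which I have not checked.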
