Hi,

No, I don't have trace logging enabled. The logging level is INFO for the other loggers.
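
One thing that may be worth checking, offered as a guess and not a diagnosis: in the code quoted below, (catch Exception e) only matches java.lang.Exception and its subclasses. If constructing the WebClient throws a java.lang.Error instead (for example a NoClassDefFoundError from a jar missing on the plugin classpath, which is purely an assumption here), it would bypass that clause, and neither "caught" nor "ending testing 123" would be printed. The sketch below illustrates the difference. The helper classify-failure is a hypothetical name, and the thrown values are simulated stand-ins for (new WebClient), since the real cause is unknown.

```clojure
;; Hypothetical helper, not part of Nutch or HtmlUnit: calls f and reports
;; which kind of Throwable (if any) escaped from it.
(defn classify-failure
  [f]
  (try
    (f)
    :ok
    (catch Exception e
      ;; Ordinary exceptions (checked or runtime) are handled here.
      [:exception (.getName (class e))])
    (catch Throwable t
      ;; Errors such as NoClassDefFoundError are Throwables but not
      ;; Exceptions, so only this clause sees them.
      [:error (.getName (class t))])))

;; Simulated failures standing in for (new WebClient); assumptions only.
(println (classify-failure #(throw (RuntimeException. "boom"))))
(println (classify-failure #(throw (NoClassDefFoundError. "some/missing/Class"))))
```

If the second case is what happens inside the plugin, temporarily widening the clause to (catch Throwable t) and printing t should at least reveal the real cause.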
-Jiang

On Fri, Jun 29, 2012 at 5:32 PM, Ferdy Galema <[email protected]> wrote:
> A quick pointer:
>
> Do you have trace logging enabled? If so, try disabling it and see if that
> works.
> See https://issues.apache.org/jira/browse/NUTCH-1253
>
>
> On Fri, Jun 29, 2012 at 11:17 AM, Jiang Fung Wong
> <[email protected]> wrote:
>
>> Dear All,
>>
>> I have a scenario where I need to initialize an HtmlUnit (a
>> browser for scraping) web client inside nutch plugin code. The code
>> is (in clojure)
>>
>> (defn parser-filter
>>   "Called by nutch to perform the parsing. Implementation of
>>   org.apache.nutch.parse.HtmlParseFilter.filter"
>>   [this content parse-result meta-tags doc]
>>
>>   (println "testing 123")
>>
>>   (try
>>
>>     (doto (new WebClient)
>>       (.setJavaScriptEnabled true)
>>       (.setThrowExceptionOnFailingStatusCode false)
>>       (.setThrowExceptionOnScriptError false))
>>
>>
>>     (catch Exception e
>>
>>       (println "caught")
>>       (throw e)))
>>
>>   (println "ending testing 123")
>>
>>   ...................
>>
>>
>> The WebClient class comes from [com.gargoylesoftware.htmlunit WebClient].
>> I believe it is an Apache HTTP client. I found that the program
>> encountered an exception inside the try block, yet the exception was not
>> caught.
>>
>>
>> The output from nutch:
>>
>> testing 123
>> Parsing: http://sg.news.yahoo.com/
>> Error parsing: http://sg.news.yahoo.com/: failed(2,200):
>> org.apache.nutch.parse.ParseException: Unable to successfully parse
>> content
>> ParseSegment: finished at 2012-06-29 09:16:31, elapsed: 00:00:07
>>
>> Neither "caught" nor "ending testing 123" was printed out.
>>
>> Any idea?
>>
>>
>> -Jiang
>>

