Nutch crawled one of our sites last night. When it encountered a URL containing a space character, it ignored everything after the space, and our application failed on the truncated URL that Nutch requested.
Example URL that should have been requested:
http://www.apache.org/cgi-bin/view?status=A%20&id=1

What Nutch tried to access instead:
http://www.apache.org/cgi-bin/view?status=A

Please investigate.

Thanks,
Rick Flosi
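
For reference, the symptom can be reproduced with a minimal Python sketch. This is not Nutch's code, and it does not claim to show where the truncation happens inside Nutch. It assumes the link on the page contained a literal space after `status=A`, and the helper names `truncate_at_space` and `encode_spaces` are hypothetical, introduced only for illustration.

```python
def truncate_at_space(url: str) -> str:
    """Mimic the reported symptom: keep only the text before the first space."""
    return url.split(" ", 1)[0]


def encode_spaces(url: str) -> str:
    """Percent-encode literal spaces and leave the rest of the URL untouched."""
    return url.replace(" ", "%20")


# Assumption: the link as it appeared in the page, with a literal space.
raw = "http://www.apache.org/cgi-bin/view?status=A &id=1"

truncated = truncate_at_space(raw)  # matches what Nutch tried to access
encoded = encode_spaces(raw)        # matches the URL that should have been requested

print(truncated)
print(encoded)
```

Cutting the string at the first space yields the URL Nutch requested, while encoding the space as `%20` yields the URL the application expected.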