My fault, sorry...
I ran several tests with Perl5Compiler and the XML configuration:

1. \s can be used: <pattern>\s</pattern> (although in Java source the backslash has to be escaped, so [String s = "\s";] is not allowed).
2. A literal whitespace character can be used: <pattern> </pattern>

Also, check your Nutch configuration and make sure the Normalizer is properly configured. You can use a + (plus) sign instead of %20 in URLs.
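To illustrate the two pattern forms, here is a minimal sketch. It assumes java.util.regex as a stand-in for Perl5Compiler (for a simple pattern like this the two should behave the same, but I have not verified that against ORO itself). The class name, method name and example.com URL are made up for the example.

```java
import java.net.URLEncoder;
import java.util.regex.Pattern;

public class WhitespacePatternDemo {
    // In Java source the backslash is doubled so that the regex engine
    // receives \s; in the XML file the pattern is written as a plain \s.
    static final Pattern ESCAPED = Pattern.compile("\\s");

    // A literal space also works as a pattern, as in <pattern> </pattern>.
    static final Pattern LITERAL_SPACE = Pattern.compile(" ");

    // Hypothetical helper: replace every match of the pattern with a plus sign.
    static String replaceWhitespace(Pattern pattern, String url) {
        return pattern.matcher(url).replaceAll("+");
    }

    public static void main(String[] args) throws Exception {
        String url = "http://example.com/my page"; // made-up URL

        System.out.println(replaceWhitespace(ESCAPED, url));       // http://example.com/my+page
        System.out.println(replaceWhitespace(LITERAL_SPACE, url)); // http://example.com/my+page

        // The JDK's form encoder also writes a space as + rather than %20.
        System.out.println(URLEncoder.encode("my page", "UTF-8")); // my+page
    }
}
```

Note that \s also matches tabs and newlines, while the literal-space pattern matches only the space character. The + form comes from form encoding (which is what URLEncoder implements), so it is the convention for the query string of a URL.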
