I need to inject a number of seed URLs that require a # sign in the parameters.
Both the basic-normalizer and the regex-normalizer (with a default configuration) would seem to remove the portion of the URL after the # I can modify the regex normalizer configuration file to eliminate this behavior, but I cannot find a way to do this for the basic normalizer without modifying the source. I have tried encoding the # sign as %23 but this does not work with the site I am crawling - it appears to need the #. Any other suggestions? If I don't invoke the basic normalize does anyone have the additional entries for the regex normalizer config file that would replace its functionality? Thanks

