Author: markus
Date: Tue Jan 12 10:33:59 2016
New Revision: 1724199

URL: http://svn.apache.org/viewvc?rev=1724199&view=rev
Log:
NUTCH-2190 Protocol normalizer

Added:
    nutch/trunk/conf/protocols.txt

Added: nutch/trunk/conf/protocols.txt
URL: http://svn.apache.org/viewvc/nutch/trunk/conf/protocols.txt?rev=1724199&view=auto
==============================================================================
--- nutch/trunk/conf/protocols.txt (added)
+++ nutch/trunk/conf/protocols.txt Tue Jan 12 10:33:59 2016
@@ -0,0 +1,7 @@
+# Example configuration file for urlnormalizer-protocol
+#
+# URL's of hosts listed in the configuration are normalized to the target
+# protocol. Useful in cases where a host accepts both http and https, doubling
+# the site's size.
+#
+# format: <host>\t<protocol>\n
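
For readers unfamiliar with the file format, here is a minimal sketch of how a `<host>\t<protocol>` file could be read and applied. It is not the urlnormalizer-protocol plugin's code (the plugin source is not part of this commit). The function names `parse_protocols` and `normalize`, the sample hosts, and the choice to leave URLs with an explicit port untouched are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical sample in the "<host>\t<protocol>\n" format described above.
SAMPLE = "# comment line\nexample.org\thttps\nexample.net\thttp\n"


def parse_protocols(text):
    """Parse '<host>\\t<protocol>' lines into a dict, skipping comments and blanks."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        host, sep, protocol = line.partition("\t")
        if not sep:
            continue  # malformed line without a tab separator
        mapping[host.strip().lower()] = protocol.strip().lower()
    return mapping


def normalize(url, mapping):
    """Rewrite the URL's scheme to the target protocol listed for its host."""
    parts = urlsplit(url)
    target = mapping.get((parts.hostname or "").lower())
    if target is None or parts.scheme == target:
        return url  # host not listed, or already on the target protocol
    if parts.port is not None:
        return url  # assumption: do not guess how explicit ports map across protocols
    return parts._replace(scheme=target).geturl()
```

With the sample above, `normalize("http://example.org/a?b=1", parse_protocols(SAMPLE))` yields `https://example.org/a?b=1`, while URLs of unlisted hosts pass through unchanged.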