Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The following page has been changed by JerryRussell:
http://wiki.apache.org/nutch/bin/nutch_dedup

The comment on the change is:
fixed classpath to org.apache

------------------------------------------------------------------------------
- dedup is an alias for net.nutch.indexer.!DeleteDuplicates
+ dedup is an alias for org.apache.nutch.indexer.!DeleteDuplicates
  
  Deletes duplicate documents in a set of Lucene indexes. Duplicates have either the same contents (via MD5 hash) or the same URL.
  
- Usage: bin/nutch net.nutch.indexer.!DeleteDuplicates (-local | -ndfs <namenode:port>) [-workingdir <workingdir>] <segmentsDir>
+ Usage: bin/nutch org.apache.nutch.indexer.!DeleteDuplicates (-local | -ndfs <namenode:port>) [-workingdir <workingdir>] <segmentsDir>
  
  CommandLineOptions
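
The duplicate rule stated on the page (same contents by MD5 hash, or same URL) can be illustrated with a small sketch. This is not the Nutch implementation: the function name `dedup`, the in-memory list of (url, content) pairs, and the choice to keep the first copy seen are all assumptions made for illustration. The page does not say which copy of a duplicate is kept.

```python
import hashlib


def dedup(docs):
    """Drop documents that repeat an earlier URL or an earlier content hash.

    docs is a hypothetical in-memory stand-in for index entries:
    an iterable of (url, content) string pairs.
    Assumption: the first copy seen is the one that is kept.
    """
    seen_urls = set()
    seen_hashes = set()
    kept = []
    for url, content in docs:
        # MD5 of the content identifies documents with identical contents.
        digest = hashlib.md5(content.encode("utf-8")).hexdigest()
        if url in seen_urls or digest in seen_hashes:
            continue  # duplicate by URL or by content
        seen_urls.add(url)
        seen_hashes.add(digest)
        kept.append((url, content))
    return kept
```

For example, given two pages with identical text under different URLs, only the first is kept; a second entry for an already-seen URL is dropped even if its text differs.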
