[ https://issues.apache.org/jira/browse/NUTCH-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061073#comment-15061073 ]
Hudson commented on NUTCH-2182:
-------------------------------

SUCCESS: Integrated in Nutch-trunk #3329 (See [https://builds.apache.org/job/Nutch-trunk/3329/])
NUTCH-2182 Make reverseUrlDirs file dumper option hash the URL for consistency (joyce: [http://svn.apache.org/viewvc/nutch/trunk/?view=rev&rev=1720466])
* trunk/CHANGES.txt
* trunk/src/java/org/apache/nutch/tools/FileDumper.java

> Make reverseUrlDirs file dumper option hash the URL for consistency
> -------------------------------------------------------------------
>
>                 Key: NUTCH-2182
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2182
>             Project: Nutch
>          Issue Type: Improvement
>          Components: tool
>    Affects Versions: 1.11
>            Reporter: Michael Joyce
>            Assignee: Michael Joyce
>             Fix For: 1.12
>
>         Attachments: NUTCH-2182_joyce_8Dec2015.patch
>
>
> At the moment the "reverseUrlDirs" option for FileDumper is terribly brittle
> and fails on a fair number of edge cases. A more robust way to handle the
> reverse URL approach to dumping a file is to reverse the server part and hash
> the URL to use as the file name. This gives us a nice split of files while
> avoiding a number of likely classes that cause dumps to fail.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
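
For readers who have not seen the patch, the approach described in the issue (reverse the server part, hash the URL for the file name) might look roughly like the sketch below. This is not the code committed to FileDumper.java. The class name ReverseUrlPathSketch, the method names, the choice of SHA-256, and the use of "/" as the directory separator are all assumptions made for illustration.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Nutch: builds a dump path of the form
// <reversed host labels>/<hex hash of the full URL>.
class ReverseUrlPathSketch {

    // Reverse the dot-separated host labels and join them with '/',
    // e.g. "www.example.com" becomes "com/example/www".
    static String reverseHost(String host) {
        String[] labels = host.split("\\.");
        StringBuilder sb = new StringBuilder();
        for (int i = labels.length - 1; i >= 0; i--) {
            sb.append(labels[i]);
            if (i > 0) {
                sb.append('/');
            }
        }
        return sb.toString();
    }

    // Hex-encoded SHA-256 of the string (assumed hash; the issue does not name one).
    static String sha256Hex(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    // The path, query, and fragment never reach the file system directly;
    // they only feed the hash, so odd characters there cannot break the name.
    static String dumpPath(String url) {
        // URI.create throws IllegalArgumentException for syntactically invalid URLs.
        String host = URI.create(url).getHost();
        if (host == null) {
            throw new IllegalArgumentException("URL has no host: " + url);
        }
        return reverseHost(host) + "/" + sha256Hex(url);
    }

    public static void main(String[] args) {
        System.out.println(dumpPath("http://www.example.com/a/b?x=1"));
    }
}
```

Because the file name is a fixed-length hex digest, overly long paths and characters that are illegal in file names stop being a concern, while the reversed host still groups files by site.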