[ https://issues.apache.org/jira/browse/NUTCH-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16546499#comment-16546499 ]

Hudson commented on NUTCH-1106:
-------------------------------

SUCCESS: Integrated in Jenkins build Nutch-trunk #3546 (See [https://builds.apache.org/job/Nutch-trunk/3546/])
NUTCH-1106 Options to skip url's based on length - add property (snagel: [https://github.com/apache/nutch/commit/579a76beb17aaa28a7dd68d6c4fe0d3e2c80bb51])
* (edit) conf/regex-urlfilter.txt.template
* (edit) src/java/org/apache/nutch/parse/ParseOutputFormat.java
* (edit) conf/nutch-default.xml
* (edit) src/java/org/apache/nutch/fetcher/FetcherThread.java
NUTCH-1106 Options to skip url's based on length - most browsers support (snagel: [https://github.com/apache/nutch/commit/8d434b5a43f997736ffcabefe83b31c780b7495c])
* (edit) src/java/org/apache/nutch/parse/ParseOutputFormat.java
* (edit) conf/nutch-default.xml
* (edit) src/java/org/apache/nutch/fetcher/FetcherThread.java
* (edit) conf/regex-urlfilter.txt.template


> Options to skip url's based on length
> -------------------------------------
>
>                 Key: NUTCH-1106
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1106
>             Project: Nutch
>          Issue Type: Improvement
>          Components: linkdb
>    Affects Versions: 1.3
>            Reporter: Markus Jelsma
>            Assignee: Sebastian Nagel
>            Priority: Major
>             Fix For: 1.15
>
>         Attachments: NUTCH-1106-1.4-1.patch
>
>
> Adds an option to skip URLs exceeding a certain length. At first we used a regex
> to impose this limit, but having this option configurable is more convenient.
> Comments?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
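
For illustration only, the two approaches mentioned in the issue description (a regex rule versus a configurable length limit) could be sketched roughly as follows. This is not the code from the commits above. The class name, the method names, the 2048-character limit, and the "negative value disables the check" convention are all assumptions made for the example; the actual property name and default live in conf/nutch-default.xml and are not quoted in this message.

```java
import java.util.regex.Pattern;

public class UrlLengthFilterSketch {

    // Hypothetical limit chosen for the example, not taken from Nutch.
    static final int MAX_URL_LENGTH = 2048;

    // Regex approach: matches any URL longer than MAX_URL_LENGTH characters.
    static final Pattern TOO_LONG =
        Pattern.compile("^.{" + (MAX_URL_LENGTH + 1) + ",}");

    // Accept the URL unless the "too long" pattern matches it.
    static boolean acceptByRegex(String url) {
        return !TOO_LONG.matcher(url).find();
    }

    // Configurable approach: compare the length against a limit read from
    // configuration. Treating a negative limit as "no limit" is an assumption.
    static boolean acceptByLength(String url, int maxLength) {
        return maxLength < 0 || url.length() <= maxLength;
    }

    // Helper that pads a fixed prefix to build a URL of exactly n characters
    // (n must be at least the length of the prefix).
    static String urlOfLength(int n) {
        StringBuilder sb = new StringBuilder("http://example.com/");
        while (sb.length() < n) {
            sb.append('a');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String atLimit = urlOfLength(MAX_URL_LENGTH);
        String overLimit = urlOfLength(MAX_URL_LENGTH + 1);

        System.out.println("regex,  at limit:   " + acceptByRegex(atLimit));
        System.out.println("regex,  over limit: " + acceptByRegex(overLimit));
        System.out.println("length, at limit:   " + acceptByLength(atLimit, MAX_URL_LENGTH));
        System.out.println("length, over limit: " + acceptByLength(overLimit, MAX_URL_LENGTH));
        System.out.println("length, no limit:   " + acceptByLength(overLimit, -1));
    }
}
```

Both checks reject the same URLs in this sketch. The length comparison avoids running a regex over every URL and lets the limit be changed without editing a filter rule, which seems to be the convenience the description refers to.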