[ https://issues.apache.org/jira/browse/NUTCH-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473990 ]
Sami Siren commented on NUTCH-247:
----------------------------------

The agent name is actually only relevant for HTTP. IMO, not setting the agent name should only block fetching in the http protocol.

> robot parser to restrict.
> -------------------------
>
>                 Key: NUTCH-247
>                 URL: https://issues.apache.org/jira/browse/NUTCH-247
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 0.8
>            Reporter: Stefan Groschupf
>         Assigned To: Dennis Kubes
>            Priority: Minor
>             Fix For: 0.9.0
>
>         Attachments: agent-names.patch
>
>
> If the agent name and the robots agents are not properly configured, the Robot
> rule parser uses LOG.severe to log the problem but also fixes it.
> Later on, the fetcher thread checks for severe errors and stops if there is one.
> RobotRulesParser:
>     if (agents.size() == 0) {
>       agents.add(agentName);
>       LOG.severe("No agents listed in 'http.robots.agents' property!");
>     } else if (!((String)agents.get(0)).equalsIgnoreCase(agentName)) {
>       agents.add(0, agentName);
>       LOG.severe("Agent we advertise (" + agentName
>                  + ") not listed first in 'http.robots.agents' property!");
>     }
> Fetcher.FetcherThread:
>     if (LogFormatter.hasLoggedSevere())     // something bad happened
>       break;
> I suggest using warn or something similar instead of severe to log this
> problem.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
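
The change suggested in the quoted issue can be sketched roughly as follows. This is only a minimal, self-contained illustration and not the attached agent-names.patch: the class name AgentCheckSketch, the method normalizeAgents, and the SevereTracker handler (a hypothetical stand-in for LogFormatter.hasLoggedSevere()) are all assumptions made for the example, and it assumes plain java.util.logging. The agent list is repaired exactly as in the quoted snippet, but the problem is logged at WARNING level, so a fetcher-style "stop on severe" check would not trip.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class AgentCheckSketch {

    /** Hypothetical stand-in for LogFormatter.hasLoggedSevere(): remembers any SEVERE record. */
    static final class SevereTracker extends Handler {
        private volatile boolean loggedSevere = false;

        @Override
        public void publish(LogRecord record) {
            if (record.getLevel().intValue() >= Level.SEVERE.intValue()) {
                loggedSevere = true;
            }
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }

        boolean hasLoggedSevere() {
            return loggedSevere;
        }
    }

    /** Repairs the agent list as in the quoted snippet, but logs at WARNING instead of SEVERE. */
    static List<String> normalizeAgents(List<String> configured, String agentName, Logger log) {
        List<String> agents = new ArrayList<>(configured);
        if (agents.isEmpty()) {
            agents.add(agentName);
            log.warning("No agents listed in 'http.robots.agents' property!");
        } else if (!agents.get(0).equalsIgnoreCase(agentName)) {
            agents.add(0, agentName);
            log.warning("Agent we advertise (" + agentName
                + ") not listed first in 'http.robots.agents' property!");
        }
        return agents;
    }

    public static void main(String[] args) {
        Logger log = Logger.getLogger("AgentCheckSketch");
        log.setUseParentHandlers(false);      // keep the demo output quiet
        SevereTracker tracker = new SevereTracker();
        log.addHandler(tracker);

        // Empty 'http.robots.agents': the list is repaired and only a warning is logged.
        List<String> agents = normalizeAgents(new ArrayList<>(), "MyCrawler", log);
        System.out.println(agents + " severe=" + tracker.hasLoggedSevere());
    }
}
```

With the misconfiguration downgraded to a warning, a check like the one in Fetcher.FetcherThread would keep running. Whether a missing agent name should block fetching at all, and for which protocols, is a separate question that this sketch does not address.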