[
https://issues.apache.org/jira/browse/NUTCH-2448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217512#comment-16217512
]
ASF GitHub Bot commented on NUTCH-2448:
---------------------------------------
sebastian-nagel closed pull request #232: NUTCH-2448: Treat white-space
http.agent.version as empty.
URL: https://github.com/apache/nutch/pull/232
This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:
diff --git a/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java b/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java
index 491d32b52..b8d2c6fa5 100644
--- a/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java
+++ b/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java
@@ -438,7 +438,7 @@ private static String getAgentString(String agentName, String agentVersion,
     StringBuffer buf = new StringBuffer();
     buf.append(agentName);
-    if (agentVersion != null) {
+    if (agentVersion != null && !agentVersion.trim().isEmpty()) {
       buf.append("/");
       buf.append(agentVersion);
     }
@@ -604,4 +604,4 @@ public BaseRobotRules getRobotRules(Text url, CrawlDatum datum,
     }
     return hm;
   }
-}
\ No newline at end of file
+}
----------------------------------------------------------------
> Allow Sending an empty http.agent.version
> -----------------------------------------
>
> Key: NUTCH-2448
> URL: https://issues.apache.org/jira/browse/NUTCH-2448
> Project: Nutch
> Issue Type: Bug
> Components: fetcher
> Affects Versions: 1.13
> Reporter: Yossi Tamari
> Priority: Minor
>
> http.agent.version defaults in nutch-default.xml to Nutch-1.14-SNAPSHOT
> (depending on the version of course).
> If I want to override it to not send a version as part of the user-agent,
> there is nothing I can do in nutch-site.xml, since putting an empty string
> there causes the default to be taken, and putting any value there causes a
> slash to be appended to the http.agent.name.
> As far as I can see, the only way to override it is to remove the value in
> nutch-default.xml, which is probably not the “correct” way, considering it
> contains a comment saying “Do not modify this file directly”.
> The suggested solution is to treat a white-space-only value as empty.
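For illustration, the whitespace check that the patch adds can be sketched in isolation. This is only a minimal sketch, not the actual Nutch code: the class AgentStringSketch and the two-argument helper buildAgent are hypothetical names, and the real HttpBase.getAgentString takes further parameters that are omitted here.

```java
class AgentStringSketch {

  // Hypothetical, simplified stand-in for the patched check in
  // HttpBase.getAgentString; only the agent name and version are handled.
  static String buildAgent(String agentName, String agentVersion) {
    StringBuilder buf = new StringBuilder();
    buf.append(agentName);
    // Append "/version" only when the version contains something
    // other than whitespace (the condition introduced by the patch).
    if (agentVersion != null && !agentVersion.trim().isEmpty()) {
      buf.append("/");
      buf.append(agentVersion);
    }
    return buf.toString();
  }

  public static void main(String[] args) {
    System.out.println(buildAgent("MyCrawler", "1.14")); // MyCrawler/1.14
    System.out.println(buildAgent("MyCrawler", " "));    // MyCrawler
    System.out.println(buildAgent("MyCrawler", null));   // MyCrawler
  }
}
```

Under this sketch, a whitespace-only version yields the bare agent name with no trailing slash, which is the behavior the issue asks for.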