[ 
https://issues.apache.org/jira/browse/HTTPCLIENT-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501982
 ] 

Roland Weber commented on HTTPCLIENT-655:
-----------------------------------------

Hi Odi,

a) I don't think we should make significant changes to the User-Agent header in 
the 3.1 code base, like dropping Jakarta from it. People may have set up filter 
rules that rely on the name. That is also the reason why I'm not sure about 
changing anything but the version indicator at all. Since it's an RFC 
violation, we might change the space character to a dash. Btw, section 3.8 of 
RFC 2616 also mentions:
[quote]
  successive versions of the same product SHOULD only differ in the 
product-version portion of the product value
[/quote]
What is the lesser evil here?

b) Dropping Jakarta for the 4.0 code base is fine. What I don't like are calls 
to System.getProperty() to collect a user agent string, at least not in the 
default User-Agent interceptor. We can have a selection of them of course. Like 
one that says Apache-HttpCore/J-4.0-a5 in core and another one that says 
Apache-HttpClient/J-4.0-a1 in client. And another one that collects values from 
system properties.
(I'd like to see the version number updated by the build process, but I have 
neither the time nor the inclination to learn Maven2...)
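
A minimal sketch of the "selection" idea in b), assuming Java 8 or later. It is not the HttpCore interceptor API: the class name UserAgents and its members are hypothetical, and only the two fixed strings and the system property names come from this thread. The fixed variants never touch System.getProperty(); the property-based variant is opt-in.

```java
import java.util.function.Supplier;

final class UserAgents {

    // Fixed defaults, using the strings proposed above. No system properties are read.
    static final Supplier<String> CORE = () -> "Apache-HttpCore/J-4.0-a5";
    static final Supplier<String> CLIENT = () -> "Apache-HttpClient/J-4.0-a1";

    // Opt-in variant: appends OS and JVM details read from system properties
    // each time get() is called, in the layout suggested in the issue description.
    static Supplier<String> fromSystemProperties(String product) {
        return () -> product
                + " (" + System.getProperty("os.name") + ";"
                + System.getProperty("os.arch") + ") "
                + "Java/" + System.getProperty("java.vm.version")
                + " (" + System.getProperty("java.vm.vendor") + ")";
    }

    public static void main(String[] args) {
        System.out.println(CORE.get());
        System.out.println(CLIENT.get());
        System.out.println(fromSystemProperties("Apache-HttpClient/J-4.0-a1").get());
    }
}
```

An interceptor would then be handed one of these suppliers, so the default stays a constant and the property lookup happens only where someone asks for it.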

c) Your suggestion also generates space characters in "(Windows XP 5.1;x86)" and 
"(Sun Microsystems Inc.)" ;-)

d) A request interceptor that checks a header for compliance is a _really_ good 
idea. I am in favor of enabling such verification interceptors by default. 
People will never learn to comply with specifications unless exceptions are 
thrown into their faces. Misbehaviour must be punished, immediately and without 
mercy (Dubious API Dictator Roland ;-)
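
A rough sketch of the check such a verification interceptor could run on a User-Agent value. It follows one reading of the RFC 2616 grammar (token and separators from section 2.2, product from section 3.8, User-Agent as one or more products or comments from section 14.43). The class UserAgentSyntax and its methods are hypothetical and not part of HttpClient, and the sketch ignores header line folding.

```java
import java.util.ArrayList;
import java.util.List;

final class UserAgentSyntax {

    // Separators from RFC 2616 section 2.2; a token may not contain any of them.
    private static final String SEPARATORS = "()<>@,;:\\\"/[]?={} \t";

    private static boolean isTokenChar(char c) {
        return c > 0x20 && c < 0x7f && SEPARATORS.indexOf(c) < 0;
    }

    // Returns the index just past the token starting at pos, or -1 if there is none.
    private static int endOfToken(String s, int pos) {
        int i = pos;
        while (i < s.length() && isTokenChar(s.charAt(i))) {
            i++;
        }
        return i > pos ? i : -1;
    }

    // Returns the index just past the (possibly nested) comment opening at pos,
    // or -1 if it is unterminated or contains a control character.
    private static int endOfComment(String s, int pos) {
        int depth = 0;
        for (int i = pos; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\') {
                i++; // quoted-pair: skip the escaped character
            } else if (Character.isISOControl(c) && c != '\t') {
                return -1;
            } else if (c == '(') {
                depth++;
            } else if (c == ')' && --depth == 0) {
                return i + 1;
            }
        }
        return -1;
    }

    // Splits a User-Agent value into its products and comments.
    // Throws IllegalArgumentException if the value does not match the grammar.
    static List<String> parse(String value) {
        List<String> elements = new ArrayList<>();
        int i = 0;
        while (i < value.length()) {
            char c = value.charAt(i);
            if (c == ' ' || c == '\t') {
                i++;
                continue;
            }
            int end;
            if (c == '(') {
                end = endOfComment(value, i);
            } else {
                end = endOfToken(value, i);
                // Optional "/" product-version after the product name.
                if (end >= 0 && end < value.length() && value.charAt(end) == '/') {
                    end = endOfToken(value, end + 1);
                }
            }
            if (end < 0) {
                throw new IllegalArgumentException(
                        "Malformed User-Agent at index " + i + ": " + value);
            }
            elements.add(value.substring(i, end));
            i = end;
        }
        if (elements.isEmpty()) {
            throw new IllegalArgumentException("User-Agent has no product or comment");
        }
        return elements;
    }

    public static void main(String[] args) {
        System.out.println(parse("Jakarta Commons-HttpClient/3.1-rc1"));
        System.out.println(parse(
                "Apache-HttpClient/3.1-rc1 (Windows XP 5.1;x86) Java/1.5.0_08 (Sun Microsystems Inc.)"));
    }
}
```

Under this reading, "Jakarta Commons-HttpClient/3.1-rc1" is accepted but splits into two products, "Jakarta" and "Commons-HttpClient/3.1-rc1", so a pure syntax check would not flag it. Spaces inside a parenthesized comment are accepted as well. A value such as the bare URL from the issue description is rejected, because ":" is a separator.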

cheers,
  Roland


> User-Agent string violates RFC
> ------------------------------
>
>                 Key: HTTPCLIENT-655
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-655
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient
>    Affects Versions: 3.1 RC1
>            Reporter: Ortwin Glück
>            Priority: Minor
>
> Our User-Agent says "Jakarta Commons-HttpClient/3.1-rc1". But space is a 
> reserved character to separate individual *products* and comments according 
> to RFC 2616, section 14.43. Jakarta is not a product. At the same time we may 
> want to drop the Jakarta name altogether.
> We should change this to something more standard like: 
> "Apache-HttpClient/3.1-rc1 ("+ System.getProperty("os.name") +";"+ 
> System.getProperty("os.arch") +") "+
> "Java/"+ System.getProperty("java.vm.version") +" ("+ 
> System.getProperty("java.vm.vendor") +")"
> which renders:
> "Apache-HttpClient/3.1-rc1 (Windows XP 5.1;x86) Java/1.5.0_08 (Sun 
> Microsystems Inc.)"
> Sun's internal Http client uses something like "Java/1.5.0_08".
> I am completely ignoring the fact that real-world user agents use almost 
> arbitrary strings.
> Some fine examples of misbehaviour from my private logs:
> "Jakmpqes dihurxf wfyiupsc" -- apparently somebody has to hide something...
> "Missigua Locator 1.9"
> "Poodle predictor 1.0"
> "shelob v1.0"
> "ISC Systems iRc Search 2.1"
> "ping.blogug.ch aggregator 1.0"
> "http://www.uni-koblenz.de/~flocke/robot-info.txt" -- ...sigh
> I am very tempted to write a User-Agent string validator that prevents misuse 
> of this field in HttpClient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


