Just to add another example of a NetBeans bug related to equals/hashCode on URL (fixed last year): https://netbeans.org/bugzilla/show_bug.cgi?id=267962 ("30s delay upon first connecting to a localhost database (due to URLClassLoader trying to DNS-resolve nbinst URL in URL.hashCode)")
-- Eirik On 10/29/17, 5:45 AM, "Lars Bruun-Hansen" <[email protected]<mailto:[email protected]>> wrote: Thank you for this. Much appreciated. I for one wasn't aware of this. I had to look into the JDK source because I refused to believe you :-) You are right. I've done some basic grepping for "Map[:space:]*<[:space:]*URL" on the NetBeans source code and as I feared this pattern appears all over the place. The reason why it is not as bad as it seems is that the vast majority of these are really jar URLs, and these URLs have their own URLStreamHandler in the JDK which overrides the sameFile() method (used in equals()) and hashCode() methods so that there's no longer any name lookup performed. This is also true for the one you point to, namely PatchModuleFileManager. Out of the box with JDK, the issue you point to affects "http(s)://", "ftp://", "file://", "mailto://", but not "jar://". On top of this - and perhaps more importantly - I can see the code base makes heavy use of its own custom URLStreamHandlers and URLStreamHandlerFactories so I would guess that this has been taken care of. The developers have indeed been aware of this as far as I can tell. All in all I think this has been taken into consideration. But you are 100% right that most people would not expect url1.equals(url2) to trigger a name service lookup (DNS query). Ouch! I had to go over my own contribution, PR 161, but luckily it uses caching by URI, not caching by URL. Phew. Thanks. On Sat, Oct 28, 2017 at 7:16 PM, Victor Williams Stafusa da Silva < [email protected]<mailto:[email protected]>> wrote: I was looking around an issue, when I landed in the class PatchModuleFileManager ( https://github.com/apache/incubator-netbeans/blob/ master/java.source.base/src/org/netbeans/modules/java/source/parsing/ PatchModuleFileManager.java, version of 20170921). This class uses java.net.URLs as keys of a HashMap. This is a really bad thing to do because URL has one of the worst implementations of equals(Object) and hashCode() ever invented. Let's see this snippet from the URL.hashCode()'s method javadoc ( https://docs.oracle.com/javase/9/docs/api/java/net/URL.html#hashCode--): > The hash code is based upon all the URL components relevant for URL comparison. As such, this operation is a blocking operation. The equals(Object) method is also similarly bad: > Two hosts are considered equivalent if both host names can be resolved into the same IP addresses; else if either host name can't be resolved, the host names must be equal without regard to case; or both host names equal to null. > Since hosts comparison requires name resolution, this operation is a blocking operation. > Note: The defined behavior for equals is known to be inconsistent with virtual hosting in HTTP. So, both equals(Object) and hashCode() method are blocking operations which do DNS lookups dependent on the state of the internet. Also, it is at least debatable the fact that hashCode() method is synchronized. At least, it caches the hashCode() method results. I'm not sure, but I would not be surprised if it is possible to construct two different instances of URLs from the same String with different hashCodes()'s due to a network topology change in the meantime. This all means that the URL class, by not having a sane equals(Object) and hashCode() implementation shouldn't be used as a key to a HashMap or LinkedHashMap nor be used in a HashSet or LinkedHashSet. In fact, I'm in favor of putting a very big and loud @Deprecated in java.net.URL. It is an innocent-looking, but actually very dangerous, performance bottleneck and bug factory. Although it is actually possible to use it safely if you never call equals(Object) and hashCode(), this is something extremely hard and error-prone to do in practice, since equals(Object) and hashCode() usage is widespread as it should be. And arguably, a class that features ill-conceived equals(Object) and hashCode() methods are itself ill-conceived. However, that is something that should be done with the JDK guys. Now, back to the PatchModuleFileManager, it uses URL as a key to a HashMap. That is nasty, so I propose that it should be replaced with URI or String. The PatchModuleFileManager uses the FileObjects class in the same package, which also uses the URL.equals(object) method. The codebase is huge, and those are just the first two classes that I checked, so it is very likely that the URL class is misused in a lot of other places in the code base, deserving a major effort to check for that. I'm not familiar with the internal details of netbeans nor is that anything that I could quicky learn, so I'm not in position of creating a PR to fix that, so I will just raise the alarm here. Victor Williams Stafusa da Silva
