Tomcat uses its own, slightly different version of the URL class:
http://tomcat.apache.org/tomcat-5.5-doc/catalina/docs/api/index.html

From its Javadoc: "URL is designed to provide public APIs for parsing and synthesizing Uniform Resource Locators as similar as possible to the APIs of java.net.URL, but without the ability to open a stream or connection. One of the consequences of this is that you can construct URLs for protocols for which a URLStreamHandler is not available (such as an "https" URL when JSSE is not installed)."
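To illustrate the difference, here is a minimal sketch, not Tomcat's class. It uses java.net.URI from the JDK as a stand-in for a parser with no URLStreamHandler lookup, and it assumes that "foo" is a protocol with no handler registered in the running JVM. The class and method names (UrlHandlerSketch, urlConstructs, hostViaUri) are made up for this example.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class UrlHandlerSketch {

    /**
     * Returns true if java.net.URL accepts the spec. Construction fails
     * when no URLStreamHandler can be found for the protocol.
     */
    @SuppressWarnings("deprecation") // URL(String) is deprecated in recent JDKs
    static boolean urlConstructs(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false; // e.g. "unknown protocol: foo"
        }
    }

    /** Parses the spec with java.net.URI, which needs no stream handler. */
    static String hostViaUri(String spec) {
        return URI.create(spec).getHost();
    }

    public static void main(String[] args) {
        // Assumption: no handler for "foo" is installed in this JVM.
        String spec = "foo://example.com/index.html";
        System.out.println(urlConstructs(spec)); // false
        System.out.println(hostViaUri(spec));    // example.com
    }
}
```

If the code only needs to parse and compare addresses, a handler-free type like this should avoid the handler lookup entirely. Whether that matters for a given crawler depends on how many threads actually construct URLs.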
The synchronized code in java.net.URL is all URLStreamHandler-related.

> -----Original Message-----
> From: Fuad Efendi [mailto:f...@efendi.ca]
> Sent: December-09-09 5:40 PM
> To: nutch-dev@lucene.apache.org
> Subject: RE: java.net.URL synchronization
>
> I checked java.net.URL; yes, Nutch and BIXO implicitly use a synchronized
> Hashtable:
>
>     public URL(String protocol, String host, int port, String file,
>                URLStreamHandler handler) throws MalformedURLException {
>         ...
>         if (handler == null &&
>             (handler = getURLStreamHandler(protocol)) == null) {
>             throw new MalformedURLException("unknown protocol: " +
>                                             protocol);
>         }
>         ...
>
> However, I don't think it hurts, because both architectures (at least
> BIXO) run a single thread in a single JVM to process, for instance,
> Outlinks. Only the "Fetch" part is multithreaded, but it doesn't use the
> URL class.
>
> Not sure about Nutch, or how the fetch list is generated... if it is
> multithreaded, then a RegexUrlNormalizer "shared" between threads is an
> even bigger problem...
>
> Fuad Efendi
> +1 416-993-2060
> http://www.tokenizer.ca/
> Data Mining, Vertical Search
>
> > -----Original Message-----
> > From: Otis Gospodnetic [mailto:ogjunk-nu...@yahoo.com]
> > Sent: December-09-09 5:12 PM
> > To: nutch-dev@lucene.apache.org
> > Subject: java.net.URL synchronization
> >
> > Hello,
> >
> > Has anyone seen this:
> > http://www.supermind.org/blog/580/java-net-url-synchronization-bottleneck
> > ?
> >
> > Is this something that needs to be addressed in Nutch (and thus in Bixo,
> > and thus in the common crawler project)?
> >
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch

Fuad Efendi
+1 416-993-2060
http://www.linkedin.com/in/liferay