Hello,
Has anyone seen this:
http://www.supermind.org/blog/580/java-net-url-synchronization-bottleneck ?
Is this something that needs to be addressed in Nutch (and thus in Bixo, and
thus in the common crawler project)?
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
I checked java.net.URL; yes, Nutch and Bixo implicitly use a synchronized
Hashtable:
    public URL(String protocol, String host, int port, String file,
               URLStreamHandler handler) throws MalformedURLException {
        ...
        if (handler == null &&
            (handler =
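For illustration, here is a minimal sketch of one way around that lookup, assuming the caller only needs to parse URLs and never opens a connection through them. The constructor quoted above only looks up a handler when none is passed in, so supplying one skips that step. ExplicitHandlerSketch, PARSE_ONLY and parse are hypothetical names, not anything from Nutch or Bixo.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

// Hypothetical helper for illustration; not part of Nutch or Bixo.
class ExplicitHandlerSketch {

    // A handler that is good enough for parsing but refuses to open connections.
    static final URLStreamHandler PARSE_ONLY = new URLStreamHandler() {
        @Override
        protected URLConnection openConnection(URL u) throws IOException {
            throw new IOException("parse-only handler cannot open " + u);
        }
    };

    // Builds a URL with an explicit handler, so the constructor's
    // "handler == null" branch (the handler lookup) is not taken.
    // Note: recent JDKs mark the URL constructors as deprecated; this still compiles.
    static URL parse(String protocol, String host, int port, String file) {
        try {
            return new URL(protocol, host, port, file, PARSE_ONLY);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL u = parse("http", "example.com", 8080, "/a/b?x=1");
        System.out.println(u.getHost() + " " + u.getPort() + " " + u.getPath());
    }
}
```

URLs built this way cannot be fetched with openConnection(), so this only fits code paths that parse or normalize URLs; whether that trade-off suits a crawler is a separate question.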
Tomcat uses its own, slightly different version of the URL class:
http://tomcat.apache.org/tomcat-5.5-doc/catalina/docs/api/index.html
URL is designed to provide public APIs for parsing and synthesizing Uniform
Resource Locators as similar as possible to the APIs of java.net.URL, but
without the ability
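As a rough illustration of the parse-only idea, the JDK's own java.net.URI also parses a URL string into its components without involving URLStreamHandler at all. This is only a sketch under the assumption that the input is a well-formed URI; UriParseSketch, hostOf and pathOf are hypothetical names, and this is not the Tomcat class linked above.

```java
import java.net.URI;

// Hypothetical helper for illustration; uses only java.net.URI.
class UriParseSketch {

    // Returns the host component, or null if the URI has no host.
    static String hostOf(String spec) {
        // URI.create throws IllegalArgumentException on malformed input.
        return URI.create(spec).getHost();
    }

    // Returns the decoded path component.
    static String pathOf(String spec) {
        return URI.create(spec).getPath();
    }

    public static void main(String[] args) {
        String spec = "http://sematext.com/";
        System.out.println(hostOf(spec) + " " + pathOf(spec));
    }
}
```

java.net.URI is stricter than java.net.URL about what it accepts (for example, unescaped spaces are rejected), so a crawler that has to cope with sloppy links would need to handle or pre-clean such input.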
See http://hudson.zones.apache.org/hudson/job/Nutch-trunk/1007/changes
Changes:
[kubes] Remove old jetty jars that should have been removed with NUTCH-768,
upgrade to Hadoop 0.20.1
--
[...truncated 4727 lines...]
jar:
init:
init-plugin:
deps-jar: