[ http://issues.apache.org/jira/browse/NUTCH-344?page=all ]
Sami Siren closed NUTCH-344.
----------------------------

> Fetcher threads blocked on synchronized block in cleanExpiredServerBlocks
> -------------------------------------------------------------------------
>
>                 Key: NUTCH-344
>                 URL: http://issues.apache.org/jira/browse/NUTCH-344
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 0.8.1, 0.9.0, 0.8
>         Environment: All
>            Reporter: Greg Kim
>             Fix For: 0.8.1, 0.9.0
>
>         Attachments: cleanExpiredServerBlocks.patch, HttpBase.patch
>
>
> The recent change to the following code in HttpBase.java tends to
> block fetcher threads while one thread busy-waits...
>
>   private static void cleanExpiredServerBlocks() {
>     synchronized (BLOCKED_ADDR_TO_TIME) {
>       while (!BLOCKED_ADDR_QUEUE.isEmpty()) {   <===== LINE 3:
>         String host = (String) BLOCKED_ADDR_QUEUE.getLast();
>         long time = ((Long) BLOCKED_ADDR_TO_TIME.get(host)).longValue();
>         if (time <= System.currentTimeMillis()) {
>           BLOCKED_ADDR_TO_TIME.remove(host);
>           BLOCKED_ADDR_QUEUE.removeLast();
>         }
>       }
>     }
>   }
>
> LINE3: As long as there are *any* entries in the BLOCKED_ADDR_QUEUE, the
> thread that first enters this block busy-waits until it becomes empty while
> all other threads block on the synchronized block. This leads to extremely
> poor fetcher performance.
>
> Since the checkin to respect crawlDelay in robots.txt, we are no longer
> guaranteed that BLOCKED_ADDR_TO_TIME queue is a fifo list. The simple fix is
> to iterate the queue once rather than busy waiting...
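
For illustration only: the contents of the attached patches are not shown in
this message, so the following is a minimal sketch of what "iterate the queue
once" could look like, not the committed fix. The class name
ServerBlockCleaner, the helpers block(), blockedCount() and isBlocked(), and
the explicit "now" parameter are all assumptions added to keep the example
self-contained and testable. Only the two collection names and the method
name come from the report.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.Map;

// Hypothetical stand-in for the relevant part of HttpBase (not the real class).
final class ServerBlockCleaner {
  private static final LinkedList<String> BLOCKED_ADDR_QUEUE = new LinkedList<>();
  private static final Map<String, Long> BLOCKED_ADDR_TO_TIME = new HashMap<>();

  // Assumed helper: mark a host as blocked until expiryMillis.
  static void block(String host, long expiryMillis) {
    synchronized (BLOCKED_ADDR_TO_TIME) {
      BLOCKED_ADDR_QUEUE.addFirst(host);
      BLOCKED_ADDR_TO_TIME.put(host, expiryMillis);
    }
  }

  // One pass over the queue: remove entries that have expired, skip the
  // rest, and release the lock instead of spinning until the queue is empty.
  static void cleanExpiredServerBlocks(long now) {
    synchronized (BLOCKED_ADDR_TO_TIME) {
      Iterator<String> it = BLOCKED_ADDR_QUEUE.iterator();
      while (it.hasNext()) {
        String host = it.next();
        Long time = BLOCKED_ADDR_TO_TIME.get(host);
        if (time == null || time <= now) {
          BLOCKED_ADDR_TO_TIME.remove(host);
          it.remove(); // safe removal while iterating
        }
      }
    }
  }

  // Assumed helper: number of hosts still queued as blocked.
  static int blockedCount() {
    synchronized (BLOCKED_ADDR_TO_TIME) {
      return BLOCKED_ADDR_QUEUE.size();
    }
  }

  // Assumed helper: whether a host still has a block entry.
  static boolean isBlocked(String host) {
    synchronized (BLOCKED_ADDR_TO_TIME) {
      return BLOCKED_ADDR_TO_TIME.containsKey(host);
    }
  }
}
```

Because the pass does not assume FIFO ordering, an unexpired entry at the
tail of the queue no longer holds up the caller or the threads waiting on the
lock. Real code would presumably pass System.currentTimeMillis() as "now".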