[ https://issues.apache.org/jira/browse/NUTCH-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298757#comment-15298757 ]
Joseph Naegele commented on NUTCH-1687:
---------------------------------------

Any issues with Tien's updated patch?

> Pick queue in Round Robin
> -------------------------
>
>                 Key: NUTCH-1687
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1687
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>            Reporter: Tien Nguyen Manh
>            Priority: Minor
>         Attachments: NUTCH-1687-2.patch, NUTCH-1687.patch, NUTCH-1687.tejasp.v1.patch
>
>
> Currently we always start from the head of the queue list when choosing a
> queue to pick a URL from, so queues at the start of the list have a better
> chance of being picked first. This can cause a long-tail problem: at the end
> only a few queues remain available, and each of them holds many URLs.
>   public synchronized FetchItem getFetchItem() {
>     final Iterator<Map.Entry<String, FetchItemQueue>> it =
>       queues.entrySet().iterator();  ==> always resets to search for a queue from the start
>     while (it.hasNext()) {
> ....
> I think it is better to pick queues in round-robin order. That reduces the
> time needed to find an available queue and ensures every queue is picked in
> turn, and if we use TopN in the generator there are no long-tail queues at
> the end.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
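
For illustration only, the round-robin selection proposed in the issue description could be sketched as below. This is not the attached patch and not Nutch code: the class `RoundRobinQueues`, its methods `add` and `next`, and the use of plain `String` URLs are all hypothetical stand-ins, and the fetcher's politeness checks (crawl delay, in-progress requests) are left out. The one idea it shows is a cursor that survives between calls, so each lookup resumes where the previous one stopped instead of restarting at the head of the list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of round-robin queue selection; not the Nutch implementation. */
class RoundRobinQueues {
    // Queue id (e.g. a host name) -> pending URLs for that queue.
    private final Map<String, Deque<String>> queues = new HashMap<>();
    // Queue ids in creation order; the cursor walks this list.
    private final List<String> order = new ArrayList<>();
    // Index of the queue to try first on the next call. It is kept between
    // calls, unlike an iterator that is recreated on every call.
    private int cursor = 0;

    /** Appends a URL to the named queue, creating the queue if needed. */
    synchronized void add(String queueId, String url) {
        Deque<String> q = queues.get(queueId);
        if (q == null) {
            q = new ArrayDeque<>();
            queues.put(queueId, q);
            order.add(queueId);
        }
        q.addLast(url);
    }

    /** Returns the next URL in round-robin order, or null if all queues are empty. */
    synchronized String next() {
        int n = order.size();
        // Visit each queue at most once per call, starting at the cursor.
        for (int i = 0; i < n; i++) {
            Deque<String> q = queues.get(order.get(cursor));
            cursor = (cursor + 1) % n; // advance even when the queue was empty
            if (!q.isEmpty()) {
                return q.pollFirst();
            }
        }
        return null;
    }
}
```

With two queues where `a` holds three URLs and `b` holds one, successive calls to `next()` alternate between `a` and `b` while both have items, then drain `a`. A selection that always restarts from the head would instead empty `a` before touching `b`.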