[ 
http://issues.apache.org/jira/browse/NUTCH-306?page=comments#action_12416673 ] 

Sami Siren commented on NUTCH-306:
----------------------------------

This patch no longer seems to apply. Could you please attach a patch 
against the current svn trunk?

> DistributedSearch.Client liveAddresses concurrency problem
> ----------------------------------------------------------
>
>          Key: NUTCH-306
>          URL: http://issues.apache.org/jira/browse/NUTCH-306
>      Project: Nutch
>         Type: Bug

>   Components: searcher
>     Versions: 0.7, 0.8-dev
>     Reporter: Grant Glouser
>     Assignee: Sami Siren
>     Priority: Critical
>  Attachments: DistributedSearch.java-patch
>
> Under heavy load, hits returned by DistributedSearch.Client can become out of 
> sync with the Client's live server list.
> DistributedSearch.Client maintains an array of live search servers 
> (liveAddresses).  This array is updated at intervals by a watchdog thread.  
> When the Client returns hits from a search, it tracks which hits came from 
> which server by saving an index into the liveAddresses array (as Hit.indexNo).
> The problem occurs when the search servers cannot service some remote 
> procedure calls before the client times out (due to heavy load, for example). 
> If the Client returns some Hits from a search, and then the array of 
> liveAddresses changes while the Hits are still being used, the indexNos for 
> those Hits can become invalid, referring to a different server than the one 
> the Hit originated from (or to no server at all!).
> Symptoms of this problem include:
> - ArrayIndexOutOfBoundsException (when the array of liveAddresses shrinks, a 
> Hit from the last server in liveAddresses in the previous update cycle now 
> has an indexNo past the end of the array)
> - IOException: read past EOF (suppose a hit comes back from server A with a 
> doc number of 1000.  Then the watchdog thread updates liveAddresses and now 
> the Hit looks like it came from server B, but server B only has 900 
> documents.  Trying to get details for the hit will read past EOF in server 
> B's index.)
> - Of course, you could also get a "silent" failure in which you find a hit on 
> server A, but the details/summary are fetched from server B.  To the user, it 
> would simply look like an incorrect or nonsense hit.
> We have solved this locally by removing the liveAddresses array.  Instead, 
> the watchdog thread updates an array of booleans (same size as the array of 
> defaultAddresses) that indicate whether that address responded to the latest 
> call from the watchdog thread.  Hit.indexNo is then always an index into the 
> complete array of defaultAddresses, so it is stable and always valid.  
> Callers of getDetails()/getSummary()/etc. must still be aware that these 
> methods may return null when the corresponding server is unable to respond.
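
For readers following the description above, here is a minimal sketch of the 
idea in Java. It is not the attached patch: the class name StableIndexSketch 
and the methods updateLive, serverFor and compactedLiveAddresses are 
hypothetical, and plain strings stand in for server addresses. It only 
illustrates why an index into the full defaultAddresses array stays valid 
while an index into a compacted live list does not.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: stable server indices plus a parallel liveness array. */
class StableIndexSketch {
    // Fixed, complete list of configured servers (stands in for defaultAddresses).
    private final String[] defaultAddresses;

    // live[i] is true if defaultAddresses[i] answered the latest watchdog call.
    // The array is replaced wholesale, so readers always see a consistent snapshot.
    private volatile boolean[] live;

    StableIndexSketch(String[] addresses) {
        this.defaultAddresses = addresses.clone();
        this.live = new boolean[addresses.length]; // all servers start as "not live"
    }

    /** Called by the (hypothetical) watchdog thread after pinging every server. */
    void updateLive(boolean[] responded) {
        if (responded.length != defaultAddresses.length) {
            throw new IllegalArgumentException("one flag per default address expected");
        }
        this.live = responded.clone();
    }

    /**
     * Resolves a hit's indexNo. The index always refers to the same server;
     * null means that server did not respond to the latest watchdog call.
     */
    String serverFor(int indexNo) {
        boolean[] snapshot = live; // read the volatile field once
        return snapshot[indexNo] ? defaultAddresses[indexNo] : null;
    }

    /** The old scheme, for contrast: a compacted list whose indices shift. */
    String[] compactedLiveAddresses() {
        boolean[] snapshot = live;
        List<String> result = new ArrayList<>();
        for (int i = 0; i < snapshot.length; i++) {
            if (snapshot[i]) {
                result.add(defaultAddresses[i]);
            }
        }
        return result.toArray(new String[0]);
    }
}
```

With three servers A, B and C all live, a hit from C has index 2 under both 
schemes. If A then stops responding, the compacted list shrinks to {B, C}, so 
index 2 falls off the end and index 0 now names B. In the sketch, serverFor(2) 
still resolves to C and serverFor(0) returns null, which callers have to 
handle.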

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira



_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
