[
https://issues.apache.org/jira/browse/NUTCH-2253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Leon Misakyan updated NUTCH-2253:
---------------------------------
Description:
Hi, as far as I can see, in the 1.11 release the ProtocolFactory class still has
an issue in its getProtocol method. Every fetcher thread has its own
ProtocolFactory instance (this.protocolFactory = new ProtocolFactory(conf); in
the FetcherThread constructor).
So making this method synchronized is useless, because each thread locks on its
own monitor.
In our project this results in multiple Protocol instances being created.
The issue can be fixed by having getProtocol use the shared conf instance as the
lock object, or by having one ProtocolFactory for all fetcher threads.
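
A minimal sketch of the first suggested fix (locking on the shared conf object), not the actual Nutch code. FakeProtocol, FakeConf and FakeProtocolFactory are hypothetical stand-ins invented for illustration. Each thread builds its own factory, as FetcherThread does, but all factories synchronize on the one conf they share:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

class SharedLockSketch {

    /** Hypothetical stand-in for a Protocol plugin instance. */
    static final class FakeProtocol { }

    /** Hypothetical stand-in for the shared conf and the object cache it carries. */
    static final class FakeConf {
        final Map<String, FakeProtocol> cache = new HashMap<>();
    }

    /** Hypothetical factory; every thread builds its own from the same conf. */
    static final class FakeProtocolFactory {
        private final FakeConf conf;

        FakeProtocolFactory(FakeConf conf) {
            this.conf = conf;
        }

        FakeProtocol getProtocol(String scheme) {
            // Lock on the shared conf rather than on 'this', so that all
            // factories built from the same conf contend for one monitor.
            synchronized (conf) {
                FakeProtocol p = conf.cache.get(scheme);
                if (p == null) {
                    p = new FakeProtocol();
                    conf.cache.put(scheme, p);
                }
                return p;
            }
        }
    }

    /** Starts one factory per thread and counts the distinct instances returned. */
    static int distinctInstances(int threads) {
        FakeConf conf = new FakeConf();
        FakeProtocol[] results = new FakeProtocol[threads];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int idx = i;
            workers[i] = new Thread(() -> {
                results[idx] = new FakeProtocolFactory(conf).getProtocol("http");
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // join makes each worker's write to results visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new HashSet<>(Arrays.asList(results)).size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInstances(8));
    }
}
```

If getProtocol were declared as a synchronized instance method instead, each thread would lock only its own factory, and two threads could both see an empty cache and both create an instance.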
was:
The method getProtocol() should be synchronized; otherwise the Fetcher threads
can access it at around the same time and query the cache before it has had a
chance to be populated properly. This would happen only for a handful of calls,
until the subsequent ones hit the cache, but it should be fixed nonetheless,
e.g. when we want a guarantee that the same Protocol instance will be used
throughout a fetching session.
The other Factory classes which use the same cache mechanism would suffer from
the same problem.
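
The check-then-populate race described above can also be avoided without an explicit lock. The sketch below uses ConcurrentHashMap.computeIfAbsent, which applies the creation function at most once per key. This is an alternative technique, not something proposed in the issue, and AtomicCacheSketch with its methods is a hypothetical name used only for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCacheSketch {
    // Hypothetical shared cache; in the scenario above it would be shared by all threads.
    private final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();
    private final AtomicInteger created = new AtomicInteger();

    Object getOrCreate(String key) {
        // The lookup and the insertion happen as one atomic step, so two
        // threads cannot both observe a missing entry and both create one.
        return cache.computeIfAbsent(key, k -> {
            created.incrementAndGet();
            return new Object(); // stand-in for an expensive plugin instance
        });
    }

    int createdCount() {
        return created.get();
    }

    /** Hits one shared cache from several threads and reports how many objects were built. */
    static int creationsUnderContention(int threads) {
        AtomicCacheSketch shared = new AtomicCacheSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> shared.getOrCreate("http"));
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.createdCount();
    }

    public static void main(String[] args) {
        System.out.println(creationsUnderContention(8));
    }
}
```

This only helps if the cache itself is shared by all threads; a per-thread cache would still yield one instance per thread.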
> ProtocolFactory still not thread-safe
> -------------------------------------
>
> Key: NUTCH-2253
> URL: https://issues.apache.org/jira/browse/NUTCH-2253
> Project: Nutch
> Issue Type: Bug
> Components: fetcher
> Affects Versions: 1.10, 1.11
> Reporter: Leon Misakyan
> Fix For: 2.3, 1.8
>
>
> Hi, as far as I can see, in the 1.11 release the ProtocolFactory class still
> has an issue in its getProtocol method. Every fetcher thread has its own
> ProtocolFactory instance (this.protocolFactory = new ProtocolFactory(conf);
> in the FetcherThread constructor).
> So making this method synchronized is useless, because each thread locks on
> its own monitor.
> In our project this results in multiple Protocol instances being created.
> The issue can be fixed by having getProtocol use the shared conf instance as
> the lock object, or by having one ProtocolFactory for all fetcher threads.