----- Original Message ----- From: "Doğacan Güney" <[EMAIL PROTECTED]>
Sent: Friday, June 08, 2007 11:25 PM

On 6/8/07, Enzo Michelangeli <[EMAIL PROTECTED]> wrote:
[...]
A more serious problem is that an implementation of equals() that returns
true even when the two hashCodes differ violates the specification of
Object.hashCode():

http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Object.html#hashCode()
"If two objects are equal according to the equals(Object) method, then
calling the hashCode method on each of the two objects must produce the same
integer result."
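
(To make the risk concrete, here is a small, purely illustrative example,
not taken from any Nutch or Hadoop code: a key class whose equals() compares
contents but whose hashCode() is left as Object's identity hash. A hash-based
map, like the WeakHashMap used as a cache, will then usually fail to find an
"equal" key, because it probes the bucket chosen by the new key's hashCode.)

import java.util.HashMap;
import java.util.Map;

// Hypothetical key that violates the equals()/hashCode() contract:
// equals() is content-based, but hashCode() stays identity-based.
class BrokenKey {
    private final String value;

    BrokenKey(String value) { this.value = value; }

    @Override
    public boolean equals(Object o) {
        return o instanceof BrokenKey && ((BrokenKey) o).value.equals(value);
    }
    // hashCode() deliberately not overridden.
}

public class ContractDemo {
    public static void main(String[] args) {
        Map<BrokenKey, String> cache = new HashMap<BrokenKey, String>();
        cache.put(new BrokenKey("conf"), "cached plugins");

        // equals() would say the keys are equal, but the lookup probes a
        // different bucket because the hashCodes differ, so the cached
        // entry is usually not found.
        System.out.println(cache.get(new BrokenKey("conf"))); // typically null
    }
}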

We can just update Configuration.hashCode to calculate the hash by summing
the hashCodes of all key/value pairs. That should make it consistent with
equals(), shouldn't it?

Sure, in fact it would be much better. I didn't mention it because it affects Hadoop, about the innards of which I know very little, and I was concerned about unforeseen side-effects. In that spirit (think global, act local ;-) ), we could also subclass org.apache.hadoop.conf.Configuration only for use by methods of PluginRepository, and override its hashCode() method instead of touching the original class.
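
Something along these lines (a rough sketch only, with a hypothetical class
name; it assumes the Configuration entries can be iterated over as Map.Entry
pairs, which Configuration.iterator() allows in later Hadoop releases; if
that is not available, the hash could be built from the protected getProps()
instead):

import java.util.Map;

import org.apache.hadoop.conf.Configuration;

// Sketch only: a Configuration used just inside PluginRepository, whose
// hashCode() is derived from its contents so that it stays consistent
// with a content-based equals(). A matching equals() that compares all
// entries would of course also be needed.
public class PluginConfiguration extends Configuration {

    public PluginConfiguration(Configuration conf) {
        super(conf);
    }

    @Override
    public int hashCode() {
        int hash = 0;
        // Sum the hash codes of all key/value pairs, as suggested above
        // (key.hashCode() ^ value.hashCode() is the Map.Entry convention).
        for (Map.Entry<String, String> entry : this) {
            hash += entry.getKey().hashCode() ^ entry.getValue().hashCode();
        }
        return hash;
    }
}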

I think so too. When a map task ends and another begins, there will be
no strong references to the configuration object of the previous map
task, so it may be garbage-collected. Nicolas Lichtmaier has a patch
for this that changes the WeakHashMap into a form of LRU map.
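
(For reference, and purely as an illustration of the general idea rather
than the actual patch: in plain Java, a LinkedHashMap in access order with
removeEldestEntry() overridden gives a simple bounded LRU cache.)

import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative bounded LRU cache (not the actual patch). With
// accessOrder=true the LinkedHashMap keeps entries ordered by most
// recent access, and removeEldestEntry() evicts the least recently
// used entry once the configured capacity is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    public LruCache(int maxEntries) {
        // 16 = initial capacity, 0.75f = load factor, true = access order.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}

A PluginRepository cache built on something like this would keep only the
most recently used Configuration-to-repository mappings, instead of relying
on the garbage collector to empty a WeakHashMap.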

BTW, this problem has been discussed before (most recently at
http://www.nabble.com/Plugins-initialized-all-the-time ). There is even
an open issue for this, NUTCH-356. I would suggest that we move
our discussion there so that we can all work on this together and fix
this once and for all. I will update the issue with the most recent
discussions.

OK, I'll subscribe to nutch-dev as well.

Cheers --

Enzo
