On 5/31/07, Nicolás Lichtmaier <[EMAIL PROTECTED]> wrote:
> Actually thinking a bit further into this, I kind of agree with you. I
> initially thought that the best approach would be to change
> PluginRepository.get(Configuration) to PluginRepository.get() where
> get() just creates a configuration internally and initializes itself
> with it. But then we wouldn't be passing JobConf to PluginRepository
> but PluginRepository would do something like a
> NutchConfiguration.create(), which is probably wrong.
>
> So, all in all, I've come to believe that my (and Nicolas') patch is a
> not-so-bad way of fixing this. It allows us to pass JobConf to
> PluginRepository and stops creating new PluginRepository-s again and
> again...
>
> What do you think?

IMO a better way would be to add a proper equals() method to Hadoop's
Configuration object (and hashcode) that would call
getProps().equals(o.getProps()). So that you could use them as keys...
Every class which is a map from keys to values has "equals & hashcode"
(Properties, HashMap, etc.).

Another nice thing would be to be able to "freeze" a configuration
object, preventing anyone from modifying it.
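To make the equals()/hashcode and "freeze" idea concrete, here is a minimal sketch. It does not use Hadoop's Configuration or Nutch's PluginRepository: SketchConf and SketchRepository are hypothetical stand-ins invented for illustration, and the freeze-on-first-use behaviour is an assumption, not something the existing code does.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for a configuration class (NOT Hadoop's Configuration).
final class SketchConf {
    private final Properties props = new Properties();
    private boolean frozen = false;

    void set(String key, String value) {
        // Once frozen, reject modifications so the hash code stays stable.
        if (frozen) {
            throw new IllegalStateException("configuration is frozen");
        }
        props.setProperty(key, value);
    }

    String get(String key) {
        return props.getProperty(key);
    }

    void freeze() {
        frozen = true;
    }

    // Two configurations are equal when their properties are equal.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SketchConf)) return false;
        return props.equals(((SketchConf) o).props);
    }

    @Override
    public int hashCode() {
        return props.hashCode();
    }
}

// Hypothetical repository cached per configuration *content*, not per instance.
final class SketchRepository {
    private static final Map<SketchConf, SketchRepository> CACHE = new HashMap<>();

    static synchronized SketchRepository get(SketchConf conf) {
        // Assumption: freeze the key so it cannot change while it is in the map.
        conf.freeze();
        return CACHE.computeIfAbsent(conf, c -> new SketchRepository());
    }
}
```

With this, two separately created configurations holding the same properties map to the same cached repository. Freezing matters here because a key that is modified after insertion into a HashMap can no longer be found under its new hash code.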
I found that there is already an issue for this problem - NUTCH-356. I
will update it with most recent discussions.

-- 
Doğacan Güney
