Re: [Nutch-dev] Plugins initialized all the time!

2007-06-08 Thread Briggs
I should have used the word "encapsulate" instead of "store". :-) On 6/8/07, Briggs <[EMAIL PROTECTED]> wrote: > Well, you could always 'freeze' it, just create a decorator for it. So, > create a new Configuration (call it ImmutableConfiguration) store the > original configuration object in it,

Re: [Nutch-dev] Plugins initialized all the time!

2007-06-08 Thread Briggs
Well, you could always 'freeze' it: just create a decorator for it. So, create a new Configuration (call it ImmutableConfiguration), store the original configuration object in it, and delegate the methods appropriately. Wouldn't that work? On 6/8/07, Doğacan Güney <[EMAIL PROTECTED]> wrote:
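
A minimal sketch of the decorator idea described above. It does not use the real Hadoop Configuration class: the `Config` interface, `MapConfig`, and the class name `FrozenConfigSketch` are hypothetical stand-ins invented for this example, and only `ImmutableConfiguration` is a name taken from the message.

```java
import java.util.HashMap;
import java.util.Map;

class FrozenConfigSketch {
    /** Hypothetical stand-in for the small part of a configuration API used here. */
    interface Config {
        String get(String name);
        void set(String name, String value);
    }

    /** Hypothetical mutable configuration backed by a map. */
    static class MapConfig implements Config {
        private final Map<String, String> props = new HashMap<>();
        public String get(String name) { return props.get(name); }
        public void set(String name, String value) { props.put(name, value); }
    }

    /** Decorator: encapsulates the original object, delegates reads, rejects writes. */
    static final class ImmutableConfiguration implements Config {
        private final Config delegate;

        ImmutableConfiguration(Config delegate) { this.delegate = delegate; }

        public String get(String name) { return delegate.get(name); }

        public void set(String name, String value) {
            throw new UnsupportedOperationException("configuration is frozen");
        }
    }

    public static void main(String[] args) {
        MapConfig original = new MapConfig();
        original.set("plugin.folders", "plugins"); // example key and value only

        Config frozen = new ImmutableConfiguration(original);
        System.out.println(frozen.get("plugin.folders")); // read is delegated

        try {
            frozen.set("plugin.folders", "elsewhere");
        } catch (UnsupportedOperationException e) {
            System.out.println("write rejected: " + e.getMessage());
        }
    }
}
```

Note that a wrapper like this only blocks writes made through the wrapper. Anyone still holding the original object can change it, so a stricter variant would copy the values at construction time.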

Re: [Nutch-dev] Plugins initialized all the time!

2007-06-08 Thread Doğacan Güney
On 5/31/07, Nicolás Lichtmaier <[EMAIL PROTECTED]> wrote: > > > Actually thinking a bit further into this, I kind of agree with you. I > > initially thought that the best approach would be to change > > PluginRepository.get(Configuration) to PluginRepository.get() where > > get() just creates a con

Re: [Nutch-dev] Plugins initialized all the time!

2007-05-31 Thread Nicolás Lichtmaier
> Actually thinking a bit further into this, I kind of agree with you. I > initially thought that the best approach would be to change > PluginRepository.get(Configuration) to PluginRepository.get() where > get() just creates a configuration internally and initializes itself > with it. But then we

Re: [Nutch-dev] Plugins initialized all the time!

2007-05-31 Thread Doğacan Güney
On 5/30/07, Doğacan Güney <[EMAIL PROTECTED]> wrote: > On 5/30/07, Andrzej Bialecki <[EMAIL PROTECTED]> wrote: > > Doğacan Güney wrote: > > > > > My patch is just a draft to see if we can create a better caching > > > mechanism. There are definitely some rough edges there:) > > > > One important in

Re: [Nutch-dev] Plugins initialized all the time!

2007-05-30 Thread Doğacan Güney
On 5/30/07, Andrzej Bialecki <[EMAIL PROTECTED]> wrote: > Doğacan Güney wrote: > > > My patch is just a draft to see if we can create a better caching > > mechanism. There are definitely some rough edges there:) > > One important piece of information: in future versions of Hadoop the method > Configuration.

Re: [Nutch-dev] Plugins initialized all the time!

2007-05-30 Thread Andrzej Bialecki
Doğacan Güney wrote: > My patch is just a draft to see if we can create a better caching > mechanism. There are definitely some rough edges there:) One important piece of information: in future versions of Hadoop the method Configuration.setObject() is deprecated and will then be removed, so we have to

Re: [Nutch-dev] Plugins initialized all the time!

2007-05-29 Thread Doğacan Güney
Hi, On 5/29/07, Nicolás Lichtmaier <[EMAIL PROTECTED]> wrote: > > > Which job causes the problem? Perhaps, we can find out what keeps > > creating a conf object over and over. > > > > Also, I have tried what you have suggested (better caching for plugin > > repository) and it really seems to make

Re: [Nutch-dev] Plugins initialized all the time!

2007-05-29 Thread Nicolás Lichtmaier
>> I'm having big troubles with nutch 0.9 that I hadn't with 0.8. It seems >> that the plugin repository initializes itself all the time until I get >> an out of memory exception. I've been looking at the code... the plugin >> repository maintains a map from Configuration to plugin repositories, but >

Re: [Nutch-dev] Plugins initialized all the time!

2007-05-29 Thread Nicolás Lichtmaier
> Which job causes the problem? Perhaps, we can find out what keeps > creating a conf object over and over. > > Also, I have tried what you have suggested (better caching for plugin > repository) and it really seems to make a difference. Can you try with > this patch(*) to see if it solves your pr

Re: [Nutch-dev] Plugins initialized all the time!

2007-05-29 Thread Briggs
I'll have to get around to trying this in the future. I have already 'forked' the code. But, would like to get back on track too. So, guess I will post something, someday. The plugin part is now the least of my worries. Again, the parsing is what is killing me now. I don't use nutch in the 'o

Re: [Nutch-dev] Plugins initialized all the time!

2007-05-29 Thread Doğacan Güney
On 5/29/07, Briggs <[EMAIL PROTECTED]> wrote: > I have also noticed this. The code explicitly loads an instance of the > plugins for every fetch (well, or parse etc., depending on what you > are doing). This causes OutOfMemoryErrors. So, if you dump the heap, > you can see the filter classes get lo

Re: [Nutch-dev] Plugins initialized all the time!

2007-05-29 Thread Briggs
I have also noticed this. The code explicitly loads an instance of the plugins for every fetch (well, or parse etc., depending on what you are doing). This causes OutOfMemoryErrors. So, if you dump the heap, you can see the filter classes get loaded and then never get unloaded (they are loaded withi

Re: [Nutch-dev] Plugins initialized all the time!

2007-05-29 Thread Doğacan Güney
Hi, On 5/28/07, Nicolás Lichtmaier <[EMAIL PROTECTED]> wrote: > I'm having big troubles with nutch 0.9 that I hadn't with 0.8. It seems > that the plugin repository initializes itself all the time until I get > an out of memory exception. I've been looking at the code... the plugin > repository manta

Re: [Nutch-dev] Plugins initialized all the time!

2007-05-28 Thread Nicolás Lichtmaier
More info... I see "map" progressing from 0% to 100%. It seems to reload plugins when reaching 100%. Besides, I've realized that each NutchJob is a Configuration, so (as there's no "equals") a plugin repo would be created for each NutchJob...
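
To illustrate the effect described above, here is a small sketch, assuming the cache is a plain map keyed by the configuration object. `Conf`, `Repo`, and `RepoCacheSketch` are hypothetical stand-ins, not Nutch or Hadoop classes. Because `Conf` does not override `equals()` or `hashCode()`, the map compares keys by identity, so every new configuration instance misses the cache and builds another repository.

```java
import java.util.HashMap;
import java.util.Map;

class RepoCacheSketch {
    /** Hypothetical configuration: no equals()/hashCode() override, so identity is used. */
    static class Conf {
        final Map<String, String> props = new HashMap<>();
    }

    /** Hypothetical plugin repository that counts how often it is constructed. */
    static class Repo {
        static int created = 0;
        Repo() { created++; }
    }

    /** Cache keyed by the configuration object itself. */
    static final Map<Conf, Repo> CACHE = new HashMap<>();

    static Repo get(Conf conf) {
        return CACHE.computeIfAbsent(conf, c -> new Repo());
    }

    public static void main(String[] args) {
        Conf shared = new Conf();
        get(shared);
        get(shared);      // same instance: cache hit, no new repository

        get(new Conf());  // same (empty) contents, different instance: miss
        get(new Conf());  // another miss

        System.out.println("repositories created: " + Repo.created);
    }
}
```

In this sketch, reusing one configuration instance yields a single repository, while each fresh instance adds one more entry that the map keeps alive.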