I have noticed this as well. The code explicitly loads a new instance of
the plugins for every fetch (or parse, etc., depending on what you are
doing), and this causes OutOfMemoryErrors. If you dump the heap, you can
see that the filter classes get loaded and then never get unloaded (they
are loaded within their own classloader). You end up with the same class
loaded thousands of times, which is bad.

So, in my case, I had to change the way the plugins are loaded.
Basically, I changed all the main plugin loaders (like
URLFilters.java, IndexFilters.java) to be singletons with a single
'getInstance()' method on each. I don't need special configs for
filters so I can deal with singletons.
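
A minimal sketch of the singleton change described above. This is not the
actual Nutch code: the class name UrlFiltersSingleton, the loadCount counter
and the placeholder filter name are all hypothetical, and the constructor
only stands in for the real plugin lookup. It just shows the shape of a
loader that is built once and then shared.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical stand-in for a plugin loader such as URLFilters.
 * Illustration only; the real Nutch class looks different.
 */
final class UrlFiltersSingleton {

    // Counts how often the (expensive) plugin load actually ran.
    private static int loadCount = 0;

    // Lazily created shared instance.
    private static UrlFiltersSingleton instance;

    private final List<String> filters = new ArrayList<String>();

    private UrlFiltersSingleton() {
        // Assumption: this is where the real loader would resolve the
        // extension point and instantiate the filter plugins.
        loadCount++;
        filters.add("hypothetical-filter");
    }

    /** Returns the one shared loader, creating it on first use. */
    static synchronized UrlFiltersSingleton getInstance() {
        if (instance == null) {
            instance = new UrlFiltersSingleton();
        }
        return instance;
    }

    static synchronized int getLoadCount() {
        return loadCount;
    }

    List<String> getFilters() {
        return filters;
    }
}
```

Every caller goes through UrlFiltersSingleton.getInstance(), so the
constructor (and with it the plugin load) runs once per JVM rather than once
per fetch. The trade-off is the one already mentioned: there is no way to
give different callers a differently configured set of filters.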

You'll find the heart of the problem somewhere in the extension point
class(es). It calls newInstance() an awful lot, but the classloader
(one per plugin) never seems to get destroyed, or something along those
lines, so this can get nasty.
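
One way to cut down on the repeated newInstance() calls would be to cache
the created object per class name. The sketch below only illustrates that
idea under assumptions: ExtensionInstanceCache is a hypothetical helper, not
something in Nutch, and it uses plain Class.forName rather than the
per-plugin classloaders. It does not unload anything, it just avoids
creating the same extension over and over.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper: hands out one cached instance per class name
 * instead of reflectively creating a new object on every request.
 */
final class ExtensionInstanceCache {

    private final Map<String, Object> instances = new HashMap<String, Object>();

    // Number of objects actually created through reflection.
    private int created = 0;

    synchronized Object getInstance(String className) {
        Object cached = instances.get(className);
        if (cached == null) {
            try {
                // Assumption: the class has a public no-arg constructor.
                cached = Class.forName(className)
                        .getDeclaredConstructor()
                        .newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(
                        "Cannot instantiate " + className, e);
            }
            instances.put(className, cached);
            created++;
        }
        return cached;
    }

    synchronized int getCreatedCount() {
        return created;
    }
}
```

Asking the cache twice for the same class name returns the same object, so
the reflective construction happens once per class. This is only safe if the
cached extensions hold no per-request state.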

I'm still dealing with my OutOfMemory errors on parsing, yuck.





On 5/29/07, Doğacan Güney <[EMAIL PROTECTED]> wrote:
> Hi,
>
> On 5/28/07, Nicolás Lichtmaier <[EMAIL PROTECTED]> wrote:
> > I'm having big trouble with nutch 0.9 that I didn't have with 0.8. It seems
> > that the plugin repository initializes itself all the time until I get
> > an out of memory exception. I've been looking at the code... the plugin
> > repository maintains a map from Configuration to plugin repositories, but
> > the Configuration object does not have an equals or hashCode method...
> > wouldn't it be nice to add such a method (comparing property values)?
> > Wouldn't that help prevent initializing many plugin repositories? What
> > could be the cause of my problem? (Aaah.. so many questions... =) )
>
> Which job causes the problem? Perhaps, we can find out what keeps
> creating a conf object over and over.
>
> Also, I have tried what you have suggested (better caching for plugin
> repository) and it really seems to make a difference. Can you try with
> this patch(*) to see if it solves your problem?
>
> (*) http://www.ceng.metu.edu.tr/~e1345172/plugin_repository_cache.patch
>
> >
> > Bye!
> >
>
>
> --
> Doğacan Güney
>


-- 
"Conscious decisions by conscious minds are what make reality real"
_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
