[ 
https://issues.apache.org/jira/browse/NUTCH-501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507626
 ] 

Doğacan Güney commented on NUTCH-501:
-------------------------------------

> Doesn't this patch have the same bug the plugin repository has now? Won't 
> different configurations, which are the same, get different caches?

It is not a bug, it is a feature :). See Andrzej's earlier comment for why we 
need different caches from different configurations that happen to contain the 
same key/value pairs.

Actually, you are right. It is a bug. However, that bug is irrelevant in *this* 
case. Notice that PluginRepository runs out of memory not because we have too 
many active configurations at once. We run out of memory because, for some 
reason that I don't quite understand yet, loaded plugin classes don't get 
'unloaded'. So, if you are running locally and you have, say, n map/reduce 
tasks in total, Nutch reloads all plugin classes for each new task and doesn't 
unload them when the task is done. So, ObjectCache only leaks stuff that 
PluginRepository leaks anyway (at worst, the overhead is just an extra 
reference to the leaked object). Everything else will be garbage collected when 
a configuration is no longer in use.

To sum up: Yes, different configurations will get different caches. But I 
believe that this will not cause (any more) problems. Feel free to prove me 
wrong :).
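
For illustration only, here is a minimal sketch of the kind of 
per-configuration cache discussed above. It is not the code from the attached 
patches. The class name ObjectCacheSketch and the nested Conf class (a 
hypothetical stand-in for Hadoop's Configuration) are assumptions, as is the 
premise that configurations are looked up by object identity because they do 
not override equals()/hashCode().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

// Sketch only: one cache per configuration instance, held through weak keys.
final class ObjectCacheSketch {

    // Hypothetical stand-in for a configuration object. It deliberately does
    // not override equals()/hashCode(), so two instances with identical
    // key/value pairs are still distinct map keys.
    static final class Conf {
        private final Map<String, String> props = new HashMap<>();
        void set(String key, String value) { props.put(key, value); }
        String get(String key) { return props.get(key); }
    }

    // Weak keys: once a Conf is no longer strongly reachable, its entry (and
    // the cache behind it) becomes eligible for garbage collection.
    private static final Map<Conf, ObjectCacheSketch> CACHES = new WeakHashMap<>();

    private final Map<String, Object> objects = new HashMap<>();

    private ObjectCacheSketch() {}

    // Returns the cache bound to this particular Conf instance, creating it
    // on first use.
    static synchronized ObjectCacheSketch get(Conf conf) {
        return CACHES.computeIfAbsent(conf, c -> new ObjectCacheSketch());
    }

    synchronized Object getObject(String key) { return objects.get(key); }

    synchronized void setObject(String key, Object value) { objects.put(key, value); }
}
```

With this sketch, two Conf instances holding the same key/value pairs get 
separate caches, which is the behaviour in question. Note that a cached value 
that strongly references its own Conf would keep the entry alive, so the weak 
keys only help when values do not point back at the configuration.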

> Implement a different caching mechanism for objects cached in configuration
> ---------------------------------------------------------------------------
>
>                 Key: NUTCH-501
>                 URL: https://issues.apache.org/jira/browse/NUTCH-501
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.0.0
>            Reporter: Doğacan Güney
>             Fix For: 1.0.0
>
>         Attachments: NUTCH-501_draft.patch, NUTCH-501_draft_v2.patch
>
>
> As per HADOOP-1343, Configuration.setObject and Configuration.getObject 
> (which are used by Nutch to cache arbitrary objects) are deprecated and will 
> be removed soon. We have to implement an alternative caching mechanism and 
> replace all usages of Configuration.{getObject,setObject} with the new 
> mechanism.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

