[ 
http://issues.apache.org/jira/browse/NUTCH-169?page=comments#action_12362438 ] 

Andrzej Bialecki  commented on NUTCH-169:
-----------------------------------------

Overall, good work! I have some comments regarding the details:

* I wonder what is the performance impact of this patch - in many places, where 
previously we used the static methods on classes initialized once per JVM 
lifetime, now we instantiate multiple instances of heavyweight objects like 
NutchConf, PluginRepository etc... I guess we'll see. ;-)

* the use of the CACHE field in filters (e.g. in QueryFilters, URLFilters, 
IndexingFilters) and factories (e.g. ProtocolFactory) troubles me, because 
there is very little chance we ever benefit from using this CACHE - please note 
that now e.g. QueryFilters are instantiated and discarded many times during one 
task, so caching filter instances doesn't help because the CACHE is discarded 
too. Perhaps the caching of instances of QueryFilters inside NutchConf (like 
you do now with PluginRepository) could solve this.

* there are some spurious or duplicate import statements, this needs to be 
cleaned up.

* there is one very strange import from Ant, in Content.java. This needs to be 
removed.

* there is one use of the old deprecated API getExtentens() (I know, the 
original code used that, but it's a good moment to replace it).

* please observe the coding style (whitespace and formatting). Nutch uses the 
Sun Coding Style. The patch is somewhat sloppy in this regard, there are 
missing or superfluous spaces (especially where the "static" qualifier was 
removed), non-aligned indents, commented out old code, strange line breaks on 
short lines, etc. Even if this is not essential for the functionaliy, it is 
still important for further maintenance, so please clean this up.

* for overridden setConf/getConf, is there any point to add the non-javadoc 
comments? I suggest to skip them altogether, they only clutter the source. The 
methods are obvious, and the javadoc will be copied from the interface javadocs.

Other than that, looks great. :-)

> remove static NutchConf
> -----------------------
>
>          Key: NUTCH-169
>          URL: http://issues.apache.org/jira/browse/NUTCH-169
>      Project: Nutch
>         Type: Improvement
>     Reporter: Stefan Groschupf
>     Priority: Critical
>      Fix For: 0.8-dev
>  Attachments: NutchConf.Fetcher.060111.patch, NutchConf.Http.060111.patch, 
> NutchConf.RegexURLFilter.060111.patch, nutchConf.patch
>
> Removing the static NutchConf.get is required for a set of improvements and 
> new features.
> + it allows a better integration of nutch in j2ee or other systems.
> + it allows the management of nutch from a web based gui (a kind of nutch 
> appliance) which will improve the usability and also increase the user 
> acceptance of nutch
> + it allows to change configuration properties until runtime
> + it allows to implement NutchConf as a abstract class or interface to 
> provide other configuration value sources than xml files. (community request)

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

Reply via email to