[ http://issues.apache.org/jira/browse/NUTCH-169?page=all ]

Marko Bauhardt updated NUTCH-169:
---------------------------------

    Attachment: NutchConf.370854.patch

> * a general comment: plugins now implement NutchConfigurable, which means 
> that you had to add two new methods, 
> looking exactly the same, to many classes. That's why the NutchConfigured 
> class was created. 
> I suggest replacing "implements NutchConfigurable" with "extends 
> NutchConfigured" where appropriate. 

Either way I have to implement two methods: with the interface I implement 
setConf and getConf; with the base class I have to override a constructor and 
setConf. 
Since in many cases a constructor would not be helpful, I decided to use the 
interface. 
In general it might make sense to have an interface or an abstract class that 
has just a configure method and nothing more. 
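
To make the trade-off concrete, here is a rough sketch of both options and of 
the single-method alternative. The type names NutchConfigurable and 
NutchConfigured come from the review; everything else (the Conf stand-in, the 
two plugin classes, the "plugin.timeout" key and the SimpleConfigurable 
interface) is an assumption made for illustration and is not the code from the 
patch.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for NutchConf; the real class is not reproduced here.
class Conf {
  private final Map<String, String> values = new HashMap<>();
  void set(String key, String value) { values.put(key, value); }
  String get(String key, String defaultValue) {
    return values.getOrDefault(key, defaultValue);
  }
}

// Option 1: the interface. Every implementor writes setConf and getConf.
interface NutchConfigurable {
  void setConf(Conf conf);
  Conf getConf();
}

// Option 2: the base class. Subclasses inherit both methods, but they must
// declare a constructor and usually override setConf to read their settings.
class NutchConfigured implements NutchConfigurable {
  private Conf conf;
  NutchConfigured(Conf conf) { setConf(conf); }
  public void setConf(Conf conf) { this.conf = conf; }
  public Conf getConf() { return conf; }
}

// Hypothetical plugin written against the interface.
class InterfacePlugin implements NutchConfigurable {
  private Conf conf;
  private int timeout;
  public void setConf(Conf conf) {
    this.conf = conf;
    this.timeout = Integer.parseInt(conf.get("plugin.timeout", "10"));
  }
  public Conf getConf() { return conf; }
  int timeout() { return timeout; }
}

// The same hypothetical plugin written against the base class.
class BaseClassPlugin extends NutchConfigured {
  // No field initializer: setConf is called from the super constructor,
  // and an initializer would overwrite the value it stores.
  private int timeout;
  BaseClassPlugin(Conf conf) { super(conf); }
  @Override
  public void setConf(Conf conf) {
    super.setConf(conf);
    this.timeout = Integer.parseInt(conf.get("plugin.timeout", "10"));
  }
  int timeout() { return timeout; }
}

// Hypothetical single-method alternative mentioned above.
interface SimpleConfigurable {
  void configure(Conf conf);
}
```

In both variants the plugin author writes two members; the base class only 
swaps getConf for a constructor.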

> * 1094: I think a better place to set the current config on a protocol 
> instance is inside the ProtocolFactory.getProtocol()
> because now the factory itself is instantiated with an instance of nutchConf, 
> so it keeps a reference to that config. 

Done.
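
Roughly, the factory now looks like the sketch below. It is a simplified 
illustration only: Conf, Protocol and HttpProtocol are stand-ins, and the 
scheme check replaces the real plugin lookup, which is not shown here.

```java
// Stand-in for NutchConf.
class Conf {}

// Stand-in for the protocol plugin interface.
interface Protocol {
  void setConf(Conf conf);
  Conf getConf();
}

// Hypothetical protocol implementation.
class HttpProtocol implements Protocol {
  private Conf conf;
  public void setConf(Conf conf) { this.conf = conf; }
  public Conf getConf() { return conf; }
}

class ProtocolFactory {
  // The factory is created with a configuration and keeps a reference to it.
  private final Conf conf;

  ProtocolFactory(Conf conf) { this.conf = conf; }

  Protocol getProtocol(String url) {
    // Assumption: only "http:" is known here; the real factory resolves
    // the protocol plugin for the URL.
    if (!url.startsWith("http:")) {
      throw new IllegalArgumentException("no protocol for " + url);
    }
    Protocol protocol = new HttpProtocol();
    // The configuration is set on the instance inside getProtocol(),
    // so callers never have to do it themselves.
    protocol.setConf(conf);
    return protocol;
  }
}
```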

> * 1256: what is this constructor for? I think only the public constructor is 
> used. 

My mistake, fixed.


> * 1311: please replace getExtentens() with getExtensions() 
> * 1346, 1375: these classes should be static, I think 
> * 1542, 1570: should be static 

Done.

> * 1903,8476,10136: I wonder, shouldn't we cache these in nutchConf? 

1903: Done.
8476, 10136: we cache the PluginRepository and not the extensions themselves. 
In my view, caching or object recycling should generally be moved into the 
tools that use the extensions / objects, rather than having an object cache 
itself. 
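
As a rough sketch of that idea: the repository is stored once per 
configuration, while extension instances are created fresh on each request and 
any reuse is left to the caller. The getObject/setObject methods on Conf, the 
static get(conf) accessor and createExtension are assumptions for illustration, 
not necessarily the API in the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for NutchConf with a simple per-configuration object cache.
class Conf {
  private final Map<String, Object> objects = new HashMap<>();
  Object getObject(String name) { return objects.get(name); }
  void setObject(String name, Object value) { objects.put(name, value); }
}

class PluginRepository {
  private PluginRepository(Conf conf) {
    // The real repository would locate and parse the plugins here.
  }

  // One repository per configuration, cached in the configuration itself.
  static synchronized PluginRepository get(Conf conf) {
    String key = PluginRepository.class.getName();
    PluginRepository repository = (PluginRepository) conf.getObject(key);
    if (repository == null) {
      repository = new PluginRepository(conf);
      conf.setObject(key, repository);
    }
    return repository;
  }

  // Extension instances are not cached: each call returns a new object,
  // and a tool that wants to reuse one keeps its own reference.
  Object createExtension(String extensionId) {
    return new Object();
  }
}
```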

> * 3154, 3650: what's the point of this line? it was already determined that 
> there is nothing useful there... this line exists also in other similar 
> facades. 

If the cache is empty, I fill it inside the if condition at line 3150. The 
line in question then reads the freshly cached values back so they can be 
assigned to the field.
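
The pattern in these facades looks roughly like this. It is a simplified 
sketch: Conf with getObject/setObject and the list of filter names are 
stand-ins for the real configuration and filter instances.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for NutchConf with a simple object cache.
class Conf {
  private final Map<String, Object> objects = new HashMap<>();
  Object getObject(String name) { return objects.get(name); }
  void setObject(String name, Object value) { objects.put(name, value); }
}

// Simplified facade; strings stand in for the filter instances.
class IndexingFilters {
  private final List<String> indexingFilters;

  IndexingFilters(Conf conf) {
    String key = IndexingFilters.class.getName();
    if (conf.getObject(key) == null) {
      // Cache miss: create the filters and store them in the configuration.
      List<String> created = new ArrayList<>();
      created.add("basic");
      created.add("more");
      conf.setObject(key, created);
    }
    // Read back whatever is cached now, whether it was just created above
    // or stored by an earlier facade, and assign it to the field.
    @SuppressWarnings("unchecked")
    List<String> cached = (List<String>) conf.getObject(key);
    this.indexingFilters = cached;
  }

  List<String> getFilters() { return indexingFilters; }
}
```

The read after the if block is therefore needed in both cases: it is the only 
place where the field is assigned.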


> * 3627: typo, should be indexingFilters. 
Done.



> * 3718, 6514: I think it would be better to create filters once, and keep 
> them around. 
> * 5020: is this an intentional change?? 
> * 6467: I think this change is an error. 
> * 6651: I don't understand this comment... 
> * 6777: should be static 
> * 7045: shouldn't we store these filters too, like all other filters, in 
> nutchConf? 
> * 7132: I think we can cache CLIENT in nutchConf too. 

Fixed.

> * 7782: either we should remove this, or use caching in nutchConf. 

Done, I removed this.

> * 10638: local variable overshadows a superclass variable. 

Done.


> * 1337 and following, inside CommonGrams.java: spurious whitespace, bad 
> formatting 
> * 1510-1539, 1748-1772, 1796, 1896-1907, 2556-2582, 2880, 3124-3160, 3207, 
> 3211-3217, 4295, 
> 4405,4657,4848,5343,5493,6566,6806-6822,6872,7295,7404,7441,7503,7540,7644,7680,7720,
>  
> 7859,7896,7964,7977,8011,8049,8214,8226,8244,8280,8456,8471,9045,9162,9227,9261,9323,9342,
>  
> 9380,9403,9580,9627,9677,9702,9779,9816,9820,9863,9871,9944,9961,10045,10130,10394,10415,
>  
> 11079,11129: inconsistent indenting, should be 2 spaces. Some missing 
> whitespace. 
> * 1613, 1629, 1691, 2262, 2515, 2687, 2861, 3244, 3510, 3774, 3929, 4010, 
> 4157, 4273, 
> 4491,6578,6831,6840,6867,6900,6932,6956,6972,6981,7045,7065,7084,7140,7169,7357,7477,
>  
> 7882,9171,9195,9204,9765,9910: whitespace 
> * 1659, 
> 2905,7404,7503,7540,7644,7683,7728,7890,7972,8015,8053,8151,8393,8471,8614,8741,8799,9005,
>  
> 9049,9221,9342,9403,9589,9627,9702,9779,9820,9871,9961,10130,10410,10672,10765,
>  
> 11129: non-javadoc generated comments should be removed 
> * 2241, 2498, 2534-2538, 4244,6176,6798,7096,7256-7264: junk 

I have done my best to get all of this fixed.


I also fixed some other problems besides the ones you mentioned. The test 
suite, the crawl process and the search all run successfully for me, both 
locally and on NDFS.
It is a really big change, though, so please test it again.

Thanks, Marko


> remove static NutchConf
> -----------------------
>
>          Key: NUTCH-169
>          URL: http://issues.apache.org/jira/browse/NUTCH-169
>      Project: Nutch
>         Type: Improvement
>     Reporter: Stefan Groschupf
>     Priority: Critical
>      Fix For: 0.8-dev
>  Attachments: NutchConf.367837.patch, NutchConf.370854.patch, 
> NutchConf.Fetcher.060111.patch, NutchConf.Http.060111.patch, 
> NutchConf.RegexURLFilter.060111.patch, nutchConf.patch
>
> Removing the static NutchConf.get is required for a set of improvements and 
> new features.
> + it allows a better integration of nutch in j2ee or other systems.
> + it allows the management of nutch from a web based gui (a kind of nutch 
> appliance) which will improve the usability and also increase the user 
> acceptance of nutch
> + it allows changing configuration properties at runtime
> + it allows implementing NutchConf as an abstract class or interface to 
> provide configuration value sources other than xml files. (community request)

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
