[ http://issues.apache.org/jira/browse/NUTCH-169?page=all ]
Marko Bauhardt updated NUTCH-169:
---------------------------------
Attachment: NutchConf.370854.patch
> * a general comment: plugins now implement NutchConfigurable, which means
> that you had to add two new methods,
> looking exactly the same, to many classes. That's why the NutchConfigured
> class was created.
> I suggest replacing "implements NutchConfigurable" with "extends
> NutchConfigured" where appropriate.
In any case i have to implement two methods in the one hand I have to implement
set and getConf in the other hand I have to overwrite a constructor and a
setConf.
Since in many cases a constructor wouldn't helpful i decide to use the
interface.
In general it may would be make sense to have interface or an abstract class
that has just a configure method nothing more.
> * 1094: I think a better place to set the current config on a protocol
> instance is inside the ProtocolFactory.getProtocol()
> because now the factory itself is instantiated with an instance of nutchConf,
> so it keeps a reference to that config.
Done.
> * 1256: what is this constructor for? I think only the public constructor is
> used.
My mistake, fixed
> * 1311: please replace getExtentens() with getExtensions()
> * 1346, 1375: these classes should be static, I think
> * 1542, 1570: should be static
Done.
> * 1903,8476,10136: I wonder, shouldn't we cache these in nutchConf?
1903: Done.
8476,10136: we cache PluginRepository and not the extensions itself. From my
point of view in general to move caching or object recycling to the tools that
use the extensions / objects but not cache the object in it self.
> * 3154, 3650: what's the point of this line? it was already determined that
> there is nothing useful there... this line exists also in other similar
> facades.
In case the cache is empty I fill the cache inside the if condition in line
3150. So get the freshly cached values to assign them to the field.
> * 3627: typo, should be indexingFilters.
Done.
> * 3718, 6514: I think it would be better to create filters once, and keep
> them around.
> * 5020: is this an intentional change??
> * 6467: I think this change is an error.
> * 6651: I don't understand this comment...
> * 6777: should be static
> * 7045: shouldn't we store these filters too, like all other filters, in
> nutchConf?
> * 7132: I think we can cache CLIENT in nutchConf too.
Fixed.
> * 7782: either we should remove this, or use caching in nutchConf.
Done, I removed this.
> * 10638: local variable overshadows a superclass variable.
Done.
> * 1337 and following, inside CommonGrams.java: spurious whitespace, bad
> formatting
> * 1510-1539, 1748-1772, 1796, 1896-1907, 2556-2582, 2880, 3124-3160, 3207,
> 3211-3217, 4295,
> 4405,4657,4848,5343,5493,6566,6806-6822,6872,7295,7404,7441,7503,7540,7644,7680,7720,
>
> 7859,7896,7964,7977,8011,8049,8214,8226,8244,8280,8456,8471,9045,9162,9227,9261,9323,9342,
>
> 9380,9403,9580,9627,9677,9702,9779,9816,9820,9863,9871,9944,9961,10045,10130,10394,10415,
>
> 11079,11129: inconsistent indenting, should be 2 spaces. Some missing
> whitespace.
> * 1613, 1629, 1691, 2262, 2515, 2687, 2861, 3244, 3510, 3774, 3929, 4010,
> 4157, 4273,
> 4491,6578,6831,6840,6867,6900,6932,6956,6972,6981,7045,7065,7084,7140,7169,7357,7477,
>
> 7882,9171,9195,9204,9765,9910: whitespace
> * 1659,
> 2905,7404,7503,7540,7644,7683,7728,7890,7972,8015,8053,8151,8393,8471,8614,8741,8799,9005,
>
> 9049,9221,9342,9403,9589,9627,9702,9779,9820,9871,9961,10130,10410,10672,10765,
>
> 11129: non-javadoc generated comments should be removed
> * 2241, 2498, 2534-2538, 4244,6176,6798,7096,7256-7264: junk
i had done my best, to get all of this fixed.
I fixed also some other problems beside these you mentioned. Anyway, the test
suite, the crawl-process and the search runs local and in the ndfs successfully
for me.
Anyway it is a really big thing so please test it again.
Thanks, Marko
> remove static NutchConf
> -----------------------
>
> Key: NUTCH-169
> URL: http://issues.apache.org/jira/browse/NUTCH-169
> Project: Nutch
> Type: Improvement
> Reporter: Stefan Groschupf
> Priority: Critical
> Fix For: 0.8-dev
> Attachments: NutchConf.367837.patch, NutchConf.370854.patch,
> NutchConf.Fetcher.060111.patch, NutchConf.Http.060111.patch,
> NutchConf.RegexURLFilter.060111.patch, nutchConf.patch
>
> Removing the static NutchConf.get is required for a set of improvements and
> new features.
> + it allows a better integration of nutch in j2ee or other systems.
> + it allows the management of nutch from a web based gui (a kind of nutch
> appliance) which will improve the usability and also increase the user
> acceptance of nutch
> + it allows to change configuration properties until runtime
> + it allows to implement NutchConf as a abstract class or interface to
> provide other configuration value sources than xml files. (community request)
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira