+1
--
Sami Siren
Doug Cutting wrote:
I propose we clean up Nutch's tools as follows.
First, some definitions:
1. An action is an operation on Nutch data. For example,
GenerateSegmentFromDB, FetchSegment, UpdateDB, IndexSegment,
MergeIndexes, SearchServer, etc. are all actions.
2. A tool
the first.
At least I like the idea of Nutch internally supporting more than one
collection.
--
Sami Siren
Hi,
It seems you need to update your configuration to point to the class
'org.apache.nutch.net.BasicUrlNormalizer' instead of
'net.nutch.net.BasicUrlNormalizer'.
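
A minimal sketch of how such stale class names could be rewritten in a configuration file's text. It assumes that every old class name starts with the package prefix `net.nutch.` and that the new name is obtained by swapping in `org.apache.nutch.`; only the single BasicUrlNormalizer pair above is confirmed by this message. The function name `migrate_class_names` and the sample `<value>` element are hypothetical.

```python
import re

# Assumption: old class names begin with "net.nutch." and map one-to-one onto
# "org.apache.nutch."; only the BasicUrlNormalizer pair is confirmed above.
# The lookbehind keeps the pattern from matching inside a longer dotted name.
OLD_PREFIX = re.compile(r"(?<![\w.])net\.nutch\.")


def migrate_class_names(config_text: str) -> str:
    """Return config_text with the old package prefix replaced by the new one."""
    return OLD_PREFIX.sub("org.apache.nutch.", config_text)


# Hypothetical configuration fragment, used only for illustration.
sample = "<value>net.nutch.net.BasicUrlNormalizer</value>"
print(migrate_class_names(sample))
```

Running the function on text that already uses the new prefix leaves it unchanged, so it can be applied to a configuration more than once.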
--
Sami Siren
Byron Miller wrote:
I created 100 fetchlists from a 50 million URL db and
when I try and run fetch I'm getting a few
should try something more basic first.
--
Sami Siren
[ http://issues.apache.org/jira/browse/NUTCH-11?page=history ]
Sami Siren resolved NUTCH-11:
-
Resolution: Fixed
Link.java needs a pre tag so javadoc renders
--
Key: NUTCH-11
URL: http
[ http://issues.apache.org/jira/browse/NUTCH-15?page=history ]
Sami Siren resolved NUTCH-15:
-
Resolution: Fixed
I applied this, but changed the default timeout to 1 msecs in
nutch-default.xml
ipc client timeout should be configurable
[ http://issues.apache.org/jira/browse/NUTCH-4?page=history ]
Sami Siren updated NUTCH-4:
---
Attachment: query_parser_unbalanced_fix.tar.gz
Changed as described by Piotr Kosiorowski. Please follow up with the additional
unit tests/comments.
Serious bug
[ http://issues.apache.org/jira/browse/NUTCH-60?page=comments#action_12313316 ]
Sami Siren commented on NUTCH-60:
-
Do you have some ready-made scripts you used to measure the performance
(quality and speed) that I could use to see if my additional