The change is not reflected on the screen after I
recompile the Nutch code and relaunch Tomcat.
Did you redeploy the web app?
--
http://motrech.free.fr/
http://www.frutch.org/
Hello,
Here is a copy of my previous mail, in case someone wants to comment on it:
I have just committed some modifications that make it possible to declare
dependencies between plugins.
I would like to apply this mechanism to the parse-ms* plugins, which both
use Jakarta POI code.
The idea is: instead
Since the plugins can specify dependencies on each other, this raises a
problem for administrators.
For a Nutch administrator, it is not user-friendly to specify which plugins
to activate/deactivate.
With plugin inter-dependencies, the administrator needs to know that a plugin
depends on another one
Hi nutch dev,
After fetching about 100 million pages I see many search engine spammers
that use a hidden div tag (negative position) to include many URLs
that users don't see when they access the page. These links alter the boost
(by inlink count), so I want to skip these URLs.
How can I do that?
+1!
On 06.09.2005 at 11:41, Jérôme Charron wrote:
Since the plugins can specify dependencies on each other, this raises a
problem for administrators.
For a Nutch administrator, it is not user-friendly to specify which plugins
to activate/deactivate.
With plugin inter-dependencies, the
This idea calls for follow ups -- with plugins that depend on each other
it's just a step towards _order dependence_ (some plugins must be
activated before other plugins, some depend on the status of the plugin
activation, etc). This in fact resembles ANT's target dependency system;
one
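As a rough illustration of the order dependence described above, an activation order can be derived from declared dependencies with a depth-first topological sort, much like a build tool orders its targets. This is only a sketch under assumptions: the class PluginOrder, the method activationOrder, and the plugin ids used here (lib-poi, parse-msword, parse-mspowerpoint) are made up for the example and are not taken from the Nutch plugin system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class PluginOrder {

    /**
     * Returns the plugin ids ordered so that each plugin appears after
     * every plugin it depends on. The map goes from a plugin id to the
     * ids it requires (hypothetical data model, not Nutch's).
     */
    public static List<String> activationOrder(Map<String, List<String>> deps) {
        List<String> order = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        // Iterate in sorted order so the result is deterministic.
        for (String id : new TreeSet<>(deps.keySet())) {
            visit(id, deps, done, inProgress, order);
        }
        return order;
    }

    private static void visit(String id, Map<String, List<String>> deps,
                              Set<String> done, Set<String> inProgress,
                              List<String> order) {
        if (done.contains(id)) {
            return; // already placed in the order
        }
        if (!deps.containsKey(id)) {
            throw new IllegalStateException("Unknown plugin: " + id);
        }
        if (!inProgress.add(id)) {
            throw new IllegalStateException("Dependency cycle at: " + id);
        }
        for (String dep : deps.get(id)) {
            visit(dep, deps, done, inProgress, order); // dependencies first
        }
        inProgress.remove(id);
        done.add(id);
        order.add(id);
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("lib-poi", Collections.<String>emptyList());
        deps.put("parse-msword", Arrays.asList("lib-poi"));
        deps.put("parse-mspowerpoint", Arrays.asList("lib-poi"));
        System.out.println(activationOrder(deps));
    }
}
```

With such an ordering an administrator would only name the plugins wanted, and the dependencies could be pulled in and activated first. Whether that fits the actual plugin code is an open question here.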
Hello,
You cannot do it. These structures were not designed for it. But you can
copy all the data to another ArrayFile, skipping the entries you want to delete.
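The copy-and-skip idea could look roughly like the sketch below. It deliberately does not use the Nutch ArrayFile/MapFile classes: a plain SortedMap stands in for the file, and the names CopySkipping and copyWithout are invented for the example. With real files, a reader over the old file and a writer for the new one would take the place of the two maps.

```java
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

public class CopySkipping {

    /**
     * Copies every entry of source into a new sorted map, in key order,
     * leaving out the keys listed in toDelete. The source is not modified.
     */
    public static <K extends Comparable<K>, V> SortedMap<K, V> copyWithout(
            SortedMap<K, V> source, Set<K> toDelete) {
        SortedMap<K, V> target = new TreeMap<>();
        for (Map.Entry<K, V> entry : source.entrySet()) {
            if (!toDelete.contains(entry.getKey())) {
                target.put(entry.getKey(), entry.getValue());
            }
        }
        return target;
    }
}
```

One thing to watch with a position-indexed file: if the position is used as the id, entries after a skipped one may end up at a different position in the copy.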
Regards
Piotr
On 9/6/05, Ben [EMAIL PROTECTED] wrote:
Hi
How can I delete an entry in the ArrayFile/MapFile if I know the id/key?
Jérôme,
You should perhaps discuss such things before having 'committed' a new
feature that already exists.
I normally read most of the Nutch mails. What was the date and subject?
I may have overlooked this one.
Stefan
Hi!
I'm writing a webdb purger, and I have an issue writing the links of the
pages that haven't been purged to the new db.
The docs seem to imply that adding a link having a source page that is not
present in the webdb should fail, but apparently it doesn't.
So I try to filter out the
You should perhaps discuss such things before having 'committed' a new
feature that already exists.
I normally read most of the Nutch mails. What was the date and subject?
I may have overlooked this one.
I don't know, it's Stefan's sentence, not mine, so please ask Stefan.
Regards
Jérôme
--
Hello Massimo.
.*-.*-.*-.* would match anything with three dashes or more in it, for
instance. A nicer-looking way would be to use something like
.*(-.*){a,b},
which is meant to match anything with between a and b dashes.
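A small check of these patterns with java.util.regex, as a sketch only. It assumes the first pattern is meant to start with a dot (.*-.*-.*-.*) and that the whole string must match. Note that since . also matches a dash, the {a,b} form only enforces the lower bound; the stricter variant with [^-] is an addition for comparison, not something from the original mail. The class and method names are invented for the example.

```java
import java.util.regex.Pattern;

public class DashPatterns {

    // Assumed reading of the first pattern: three or more dashes.
    static final Pattern THREE_OR_MORE = Pattern.compile(".*-.*-.*-.*");

    /** The {a,b} pattern from the mail with concrete bounds filled in. */
    static Pattern loose(int a, int b) {
        return Pattern.compile(".*(-.*){" + a + "," + b + "}");
    }

    /**
     * Stricter variant: [^-] cannot consume a dash, so the number of
     * dashes in the whole string must lie between a and b.
     */
    static Pattern strict(int a, int b) {
        return Pattern.compile("[^-]*(-[^-]*){" + a + "," + b + "}");
    }

    /** True if the pattern matches the entire string. */
    static boolean fullMatch(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(fullMatch(THREE_OR_MORE, "a-b-c-d")); // three dashes
        System.out.println(fullMatch(loose(1, 2), "a-b-c-d"));   // upper bound not enforced
        System.out.println(fullMatch(strict(1, 2), "a-b-c-d"));  // upper bound enforced
    }
}
```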
Fredrik
On 9/6/05, Massimo Miccoli [EMAIL PROTECTED] wrote:
Hi,
A bit more info:
The addLink documentation says: "Links are only permitted in the webdb if they
have a valid source MD5 for a Page that is also in the webdb." Yet I can
insert a link with the MD5 of a page that is not in the webdb.
Also, I can now filter out the offending links by reading both the