Hi Stefan,
did you find a solution?
I'd really like to give the admin gui a try.
Cheers
Karsten
PS: My offer to host that file is still open :-)
Stefan Groschupf wrote:
I think it should be possible to put your binary at the Apache site,
probably Doug will be the right person to talk
Hello,
2006/5/3, Stefan Groschupf [EMAIL PROTECTED]:
As soon as you have the file uploaded to your server, please publish the link
immediately on the nutch-user mailing list to load balance the traffic as much
as we can.
I have put your file here:
Hello,
I have used the intranet crawl for the following simple task:
Given a list of relevant start URLs,
get all documents within the reach of two clicks.
We use this mechanism for monitoring a couple of dozen lists on the
internet.
This was easy using the -depth parameter of the crawl tool.
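
To illustrate what "within the reach of two clicks" means, here is a minimal sketch of a depth-limited breadth-first traversal. It is not how the crawl tool is implemented; the function name within_clicks, the LINKS graph and the example.org URLs are all hypothetical, and the link graph is held in memory instead of being fetched.

```python
from collections import deque


def within_clicks(links, start_urls, max_clicks=2):
    """Return {url: clicks} for every URL reachable in at most max_clicks hops."""
    seen = {url: 0 for url in start_urls}
    queue = deque(start_urls)
    while queue:
        url = queue.popleft()
        # Do not follow links from pages that are already at the limit.
        if seen[url] == max_clicks:
            continue
        for target in links.get(url, []):
            if target not in seen:
                seen[target] = seen[url] + 1
                queue.append(target)
    return seen


# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "http://example.org/list": ["http://example.org/a", "http://example.org/b"],
    "http://example.org/a": ["http://example.org/a1"],
    "http://example.org/a1": ["http://example.org/too-deep"],
}

reachable = within_clicks(LINKS, ["http://example.org/list"])
```

With this graph, the page three clicks away from the start URL is left out, while everything up to two clicks is kept.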
Dear list,
I would like to process metadata from publication repositories into a
nutch index.
The metadata comes as XML (OAI-PMH to be more precise).
The starting URLs look like
http://oai_host/servlet?method=getRecords&set=someSet
These requests return lists,
which basically look like
list
Hi,
in my opinion
Julius Schorzman wrote:
http://www.apache.com
is not matched by the regex
+^http://([a-z0-9]*\.)*apache.com/
as it does not end with a trailing slash.
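
A quick way to check this is to try the pattern against both forms of the URL. The sketch below uses Python's re module, on the assumption that it treats this simple pattern the same way the filter's regex engine does; the leading "+" is left out because, as far as I understand, it is the filter's include marker and not part of the expression. The helper name is_accepted is made up for the example.

```python
import re

# The pattern from the filter line, without the leading "+".
URL_PATTERN = re.compile(r"^http://([a-z0-9]*\.)*apache.com/")


def is_accepted(url):
    """True if the pattern matches at the start of the URL."""
    return URL_PATTERN.search(url) is not None


without_slash = is_accepted("http://www.apache.com")
with_slash = is_accepted("http://www.apache.com/")
```

Here without_slash is False and with_slash is True, so adding the trailing slash to the URL is enough to make it pass.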
Cheers
Karsten
Hi Bill,
this starts the process as root?
That shouldn't be necessary.
One recommended way is to run tomcat as a daemon using jsvc,
see
http://tomcat.apache.org/tomcat-5.5-doc/setup.html
Works fine for me.
Cheers
Karsten
Bill Goffe wrote:
Last month I mentioned that I was having problems
Have you enabled the tomcat manager application at all?
If not, see
http://tomcat.apache.org/tomcat-5.0-doc/manager-howto.html
There are easier ways to deploy a war file, anyway.
Cheers
Karsten
?
Any help is still very much appreciated!
Best regards
Karsten
-- Forwarded message --
From: Karsten Dello [EMAIL PROTECTED]
Date: 06.12.2006 02:44
Subject: Problem with fetching (cont.)
To: nutch-user@lucene.apache.org
Sorry, the mail I just sent was incomplete
hello,
i just migrated from 0.8.1 to 0.9 and ran into a problem with parsing
(we do parsing after fetching) of a 50-page segment.
the process is using 0% cpu, but a lot of memory (it has been going like that for
hours). it seems to be stalled according to the logfiles.
PID USER PR NI VIRT RES
Hi,
fetching via https doesn't work with protocol-httpclient if a proxy is
used.
The attached patch in
http://issues.apache.org/jira/browse/NUTCH-126
solved this problem 1 1/2 years ago, but it is in neither the 0.9 release nor
trunk: