No problem for me. I have just run the test crawl on
http://lucene.apache.org/nutch as described in the new tutorial, and a lot
of pdf and png files were causing big exceptions and stack traces in
the log. I thought that people (usually using nutch for the first time)
might think that they did something
Hello,
We should probably change the user agent string in nutch-default.xml to
point to the Apache site. The only question is http.agent.version - should
we set it to 0.07 for release and 0.08-dev for future work? I do not
know how it was used previously.
Current values:
property
Hi,
I just wanted to finally add myself to the list of nutch committers on
the nutch website, and I am not sure how to deploy it.
So I have installed forrest and modified
src/site/src/documentation/content/xdocs.
Then I ran 'forrest', and it generated content in src/site/build/site.
And now the
Hello,
In my experience it is very important to use anchor text and to give it
quite a high boost. It allows me to return http://www.aa.com when a user
searches for American Airlines. Without anchor text this was
impossible to achieve: a lot of sites (spam or not) with american
airlines in the url and
Hello,
I think it is a good idea to release ASAP. I wanted to contribute my code
for fault-tolerant searching, but it is taking more time than I expected
because, as some of you know, in the meantime I became a father. But I hope I
will be able to send something for comments early next week. I will look
at
Hello,
Tested on cygwin and on a linux box. The ':'-based syntax is used earlier in
the nutch script too. Committed.
Thanks
Piotr
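
For readers unfamiliar with the ':'-based form mentioned above, here is a minimal sketch of how such a check might look. It is not the actual line from the nutch script; the helper name is_cygwin is hypothetical, and the pattern is only an assumption about what the check needs to match.

```shell
# Hypothetical helper (not taken from bin/nutch): succeeds when the given
# uname string starts with CYGWIN.
# The POSIX ':' operator of expr matches a regular expression anchored at the
# start of the string and exits with status 0 only if the match is non-empty.
is_cygwin() {
  expr "$1" : 'CYGWIN.*' > /dev/null
}

if is_cygwin "$(uname)"; then
  echo "running under cygwin"
else
  echo "not running under cygwin"
fi
```

The ':' operator is part of POSIX expr, whereas 'expr match' is an extension that not every expr implementation accepts.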
Erik Hatcher wrote:
I'm getting expr: syntax error when running all bin/nutch commands.
It comes from this line:
if expr match `uname` 'CYGWIN*' /dev/null; then
should
Hello,
I understand you have all your segments in
/home/fji/SE/nutch-nightly/crawl.test/
but according to the log file you sent, nutch is looking in:
/home/fji/SE/tomcat4/segments
Please copy your segment directory from
/home/fji/SE/nutch-nightly/crawl.test/
to
/home/fji/SE/tomcat4/
and restart