Hi,
I found it in updatedb. Exactly here:
svn.apache.org/viewvc/nutch/branches/2.x/src/java/org/apache/nutch/crawl/DbUpdateReducer.java?view=markup#l207
What was happening is that I had db.update.additions.allowed=false, and it
also filters out the redirects :\ (error after upgrading to
Great...maybe this is a bug in the Tika codebase!
On Thu, Nov 20, 2014 at 10:02 AM, MengYing Wang mengyingwa...@gmail.com
wrote:
Dear Lewis,
Problem solved by reverting rome-1.0.jar to rome-0.9.jar in
parse-tika.
Same idea as the feed parser in
Hi everyone,
If you run Nutch on Windows using Cygwin, it may fail due to a
permission error.
$ ./crawl urls crawlId http://localhost:8983/solr/collection1 2
2014-11-17 15:39:25,041 ERROR security.UserGroupInformation -
PriviledgedActionException as:YangLu cause:java.io.IOException:
This is not a good workaround at all.
There are many reasons why this is not a good idea.
If I were you, I would seriously suggest you download and work with
VirtualBox on a Linux image. It will make your life so much easier and the
barrier to entry is very low these days.
Lewis
On Thu, Nov 20,
Great, can you attach a patch for this?
Chris Mattmann
chris.mattm...@gmail.com
-Original Message-
From: MengYing Wang mengyingwa...@gmail.com
Date: Thursday, November 20, 2014 at 7:02 PM
To: Lewis John Mcgibbney lewis.mcgibb...@gmail.com
Cc:
Dear Prof Mattmann,
Yes, I will create a jira and attach the patch. But one more thing, do you
happen to know how to modify the parse-tika configuration files to
automatically download the rome-0.9.jar instead of the rome-1.0.jar?
Currently, if you run the ant -f ./build-ivy.xml command in the