Re: Where happens the inject of Redirects and outlinks?

2014-11-20 Thread Alfonso Nishikawa
Hi, I found it in updatedb, exactly here: svn.apache.org/viewvc/nutch/branches/2.x/src/java/org/apache/nutch/crawl/DbUpdateReducer.java?view=markup#l207 What was happening is that I had db.update.additions.allowed=false, and it filters out the redirects too :\ (error after upgrading to
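
For readers hitting the same symptom, the setting named in the message above can be sketched as an override in nutch-site.xml. This is only an illustration: the file placement follows the usual Nutch convention of overriding nutch-default.xml, and the value `true` is an assumption about what restores the previous behaviour, based on the message's report that `false` also drops redirects during updatedb.

```xml
<!-- nutch-site.xml (assumed location: conf/ in the Nutch runtime) -->
<property>
  <name>db.update.additions.allowed</name>
  <!-- true: updatedb may add newly discovered URLs to the db.
       false: only URLs already in the db are updated; per the message
       above, redirect targets are filtered out as well. -->
  <value>true</value>
</property>
```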

Re: [nsf-polar-usc-students] ExceptionInInitializerError caused by NPE

2014-11-20 Thread Lewis John Mcgibbney
Great... maybe this is a bug in the Tika codebase! On Thu, Nov 20, 2014 at 10:02 AM, MengYing Wang mengyingwa...@gmail.com wrote: Dear Lewis, Problem solved by replacing rome-1.0.jar with the older rome-0.9.jar in parse-tika. Same idea as the feed parser in

Nutch in Windows: Failed to set permissions of path

2014-11-20 Thread MengYing Wang
Hi everyone, If you run Nutch on Windows using Cygwin, it may fail due to a permission error. $./crawl urls crawlId http://localhost:8983/solr/collection1 2 2014-11-17 15:39:25,041 ERROR security.UserGroupInformation - PriviledgedActionException as:YangLu cause:java.io.IOException:

Re: [nsf-polar-usc-students] Nutch in Windows: Failed to set permissions of path

2014-11-20 Thread Lewis John Mcgibbney
This is not a good workaround at all. There are many reasons why this is not a good idea. If I were you, I would seriously suggest you download and work with VirtualBox on a Linux image. It will make your life so much easier and the barrier to entry is very low these days. Lewis On Thu, Nov 20,

Re: [nsf-polar-usc-students] ExceptionInInitializerError caused by NPE

2014-11-20 Thread Chris Mattmann
Great, can you attach a patch for this? Chris Mattmann chris.mattm...@gmail.com -----Original Message----- From: MengYing Wang mengyingwa...@gmail.com Date: Thursday, November 20, 2014 at 7:02 PM To: Lewis John Mcgibbney lewis.mcgibb...@gmail.com Cc:

Re: [nsf-polar-usc-students] ExceptionInInitializerError caused by NPE

2014-11-20 Thread MengYing Wang
Dear Prof Mattmann, Yes, I will create a JIRA issue and attach the patch. One more thing: do you happen to know how to modify the parse-tika configuration files so that rome-0.9.jar is downloaded automatically instead of rome-1.0.jar? Currently, if you run the ant -f ./build-ivy.xml command in the