Re: [nsf-polar-usc-students] ExceptionInInitializerError caused by NPE

2014-11-20 Thread MengYing Wang
Dear Prof. Mattmann, Yes, I will create a JIRA issue and attach the patch. One more thing: do you happen to know how to modify the parse-tika configuration files so that rome-0.9.jar is downloaded automatically instead of rome-1.0.jar? Currently, if you run the "ant -f ./build-ivy.xml" command in the

Re: [nsf-polar-usc-students] ExceptionInInitializerError caused by NPE

2014-11-20 Thread Chris Mattmann
Great, can you attach a patch for this? Chris Mattmann chris.mattm...@gmail.com -Original Message- From: MengYing Wang Date: Thursday, November 20, 2014 at 7:02 PM To: Lewis John Mcgibbney Cc: "dev@nutch.apache.org" , NSF Polar CyberInfrastructure DR Students

Re: [nsf-polar-usc-students] Nutch in Windows: Failed to set permissions of path

2014-11-20 Thread Lewis John Mcgibbney
This is not a good workaround at all. There are many reasons why this is not a good idea. If I were you, I would seriously suggest you download and work with VirtualBox on a Linux image. It will make your life so much easier and the barrier to entry is very low these days. Lewis On Thu, Nov 20, 20

Nutch in Windows: Failed to set permissions of path

2014-11-20 Thread MengYing Wang
Hi everyone, If you run Nutch on Windows using Cygwin, it may fail due to a permission error. $./crawl urls crawlId http://localhost:8983/solr/collection1 2 2014-11-17 15:39:25,041 ERROR security.UserGroupInformation - PriviledgedActionException as:YangLu cause:java.io.IOException: Failed

Re: [nsf-polar-usc-students] ExceptionInInitializerError caused by NPE

2014-11-20 Thread Lewis John Mcgibbney
Great...maybe this is a bug in the Tika codebase! On Thu, Nov 20, 2014 at 10:02 AM, MengYing Wang wrote: > Dear Lewis, > > Problem solved by replacing the rome-1.0.jar back to rome-0.9.jar in > parse-tika. > Same idea as the feed parser in > https://issues.apache.org/jira/browse/NUTCH-1494. Tha

Re: [nsf-polar-usc-students] ExceptionInInitializerError caused by NPE

2014-11-20 Thread MengYing Wang
Dear Lewis, Problem solved by reverting rome-1.0.jar to rome-0.9.jar in parse-tika. Same idea as the feed parser in https://issues.apache.org/jira/browse/NUTCH-1494. Thanks. Best, Mengying (Angela) Wang On Wed, Nov 19, 2014 at 9:08 PM, Lewis John Mcgibbney < lewis.mcgibb...@gmail.com> w

Re: Where happens the inject of Redirects and outlinks?

2014-11-20 Thread Alfonso Nishikawa
Hi, I found it in updatedb. Exactly here: svn.apache.org/viewvc/nutch/branches/2.x/src/java/org/apache/nutch/crawl/DbUpdateReducer.java?view=markup#l207 The problem was that I had db.update.additions.allowed=false, which also filters out the redirects :\ (error after upgrading to 2.3-SNAPSHO