Re: NullPointerException with trunk

2007-11-27 Thread Dennis Kubes
Yes, this is currently a bug in trunk which errors out when the content for a given URL is null. This bug is in the process of being fixed. Dennis Alexis Votta wrote: I have updated my copy of Nutch from subversion to revision 597822. With minimal settings like nutch-site.xml,

Re: NullPointerException when trying to init NutchBean

2007-10-12 Thread Wolfgang Woerndl
Thanks, now it works, just some feedback for everybody: - Including the Nutch conf directory in the classpath solved the NPE - I really need to set the path to the index dir in the NutchBean constructor, otherwise I get 0 hits (despite having a searcher.dir property with the path in

Re: NullPointerException when trying to init NutchBean

2007-10-07 Thread Dennis Kubes
My guess, seeing your error below, is that you didn't move over the common-terms.utf8 or other needed files from the nutch conf directory into the classpath of your web application. Dennis Kubes Wolfgang Woerndl wrote: Hello, I installed Nutch 0.8.1., crawled some Web pages and get
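A quick way to test this diagnosis is to ask the class loader whether the config files are visible at all. The following is only a minimal sketch: the class `ClasspathCheck` and its method are hypothetical helpers, not part of Nutch, and the file names are the ones mentioned in this thread.

```java
// Hypothetical helper, not part of Nutch: reports whether a named resource
// (for example common-terms.utf8, nutch-default.xml or nutch-site.xml)
// can be found on the classpath of the running application.
final class ClasspathCheck {

    static boolean onClasspath(String resourceName) {
        // Prefer the context class loader, which is the one a web
        // application normally uses; fall back to the system loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourceName) != null;
    }

    public static void main(String[] args) {
        String[] needed = {"nutch-default.xml", "nutch-site.xml", "common-terms.utf8"};
        for (String name : needed) {
            System.out.println(name + " on classpath: " + onClasspath(name));
        }
    }
}
```

If any of these prints `false` inside the web application, the conf directory (or a copy of its files) is probably missing from the classpath.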

Re: NullPointerException when trying to init NutchBean

2007-10-05 Thread Sagar Naik
Hey, I would like to mention 2 points: - The nutch config files should be in the classpath. - The 2nd arg in the NutchBean ctor is the path to the index dir. I guess this should solve the NPE. Wolfgang Woerndl wrote: Hello, I installed Nutch 0.8.1., crawled some Web pages and get (meaningful) results

Re: NullPointerException while fetching

2007-09-17 Thread eyal edri
Can you show the source html file that produces this exception? There's an issue with pages that don't mention the content type in the header (usually in redirects), so nutch throws an exception. If that is the case, there's a code line in Content.java that needs to be modified, On 9/18/07,

Re: SOLVED? Re: NullPointerException fetching some sites with temp redirects

2007-07-26 Thread Carl Cerecke
Carl Cerecke wrote: The problem is that the contentType for the page (that it was redirected to) is null. Changing Content.java:165 to: Text.writeString(out, contentType != null ? contentType : ""); // write contentType fixes the problem. But is empty string better for an unknown content

SOLVED? Re: NullPointerException fetching some sites with temp redirects

2007-07-26 Thread Carl Cerecke
The problem is that the contentType for the page (that it was redirected to) is null. Changing Content.java:165 to: Text.writeString(out, contentType != null ? contentType : ""); // write contentType fixes the problem. But is empty string better for an unknown content type or something like
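The null guard described in this message can be tried outside Nutch with a small stand-in. This is only a sketch under stated assumptions: `ContentTypeGuard` is a hypothetical class, and its `writeString` is a stand-in built on `DataOutput.writeUTF`, not the real `Text.writeString`; it merely fails on null in a similar way.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the serialization step discussed above.
final class ContentTypeGuard {

    // Stand-in for Text.writeString (an assumption, not the Nutch code):
    // writeUTF throws NullPointerException when given a null string.
    static void writeString(DataOutput out, String s) throws IOException {
        out.writeUTF(s);
    }

    // The guard from the proposed fix: substitute "" for a null value.
    static String orEmpty(String s) {
        return s != null ? s : "";
    }

    // Serializes a possibly-null content type without throwing an NPE.
    static byte[] serialize(String contentType) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            writeString(out, orEmpty(contentType)); // write contentType
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println("bytes written for null: " + serialize(null).length);
        System.out.println("bytes written for text/html: " + serialize("text/html").length);
    }
}
```

Whether an empty string is the right placeholder for an unknown content type is the open question raised in the message; the sketch only shows that the guard avoids the exception.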

Re: NullPointerException fetching some sites with temp redirects

2007-07-26 Thread Kai_testing Middleton
I'll try those if I get a chance. (BTW Remuneration is misspelled on absoluteit.co.nz if you care) --Kai M. - Original Message From: Carl Cerecke [EMAIL PROTECTED] To: nutch-user@lucene.apache.org Sent: Thursday, July 26, 2007 4:21:07 PM Subject: Re: NullPointerException fetching some

Re: NullPointerException fetching some sites with temp redirects

2007-07-26 Thread Carl Cerecke
Is anybody else getting NullPointerExceptions fetching either of these two sites (0.90 and latest from trunk) ? http://www.absoluteit.co.nz http://defence.allmedia.co.nz I am, but would be grateful if someone else could test whether they work or not so I can eliminate nutch configuration

Re: SOLVED? Re: NullPointerException fetching some sites with temp redirects

2007-07-26 Thread Doğacan Güney
On 7/27/07, Carl Cerecke [EMAIL PROTECTED] wrote: Carl Cerecke wrote: The problem is that the contentType for the page (that it was redirected to) is null. Changing Content.java:165 to: Text.writeString(out, contentType != null ? contentType : ""); // write contentType fixes the

Re: NullPointerException fetching some sites with temp redirects

2007-07-25 Thread Doğacan Güney
Hi, On 7/25/07, Carl Cerecke [EMAIL PROTECTED] wrote: Hi, Using nutch 0.9, although I get the same with a more recent nightly build. I'm getting NPE fetching these two pages: http://www.absoluteit.co.nz and http://defence.allmedia.co.nz I've tracked it down by putting a t.printStackTrace()

Re: NullPointerException fetching some sites with temp redirects

2007-07-25 Thread Carl Cerecke
Hi Doğacan, Yes, I get the NullPointerException with the latest trunk, too. Cheers, Carl. Doğacan Güney wrote: Hi, On 7/25/07, Carl Cerecke [EMAIL PROTECTED] wrote: Hi, Using nutch 0.9, although I get the same with a more recent nightly build. I'm getting NPE fetching these two pages:

Re: NullPointerException fetching some sites with temp redirects

2007-07-25 Thread Carl Cerecke
Hi, Included Content.java. Will retry with latest trunk shortly. Content.java:137-149 137 protected final void writeCompressed(DataOutput out) throws IOException { 138 out.writeByte(VERSION); 139 140 Text.writeString(out, url); // write url 141 Text.writeString(out, base); // write

Re: NullPointerException during Fetch

2007-04-09 Thread Meryl Silverburgh
Thanks . I attached my nutch-site.xml file. But for some reason, I now get: $ bin/nutch fetch $s1 Fetcher: starting Fetcher: segment: crawl/segments/20070409222306 Fetcher: java.io.IOException: Segment already fetched! at

Re: NullPointerException during Fetch

2007-04-08 Thread Ratnesh,V2Solutions India
Open nutch-site.xml and nutch-default.xml, and in the plugin.includes property set the value like <value>index-basic|index-more|.</value> alongside the other values; just add these plugins as extras. Ratnesh,V2Solutions India Meryl Silverburgh

Re: NullPointerException during Fetch

2007-04-07 Thread Ratnesh,V2Solutions India
Check whether you have included the index-basic and index-more plugins in your nutch-site.xml file; the same problem was solved by including them in this file. Hope this will solve the issue... Ratnesh V2Solutions,India Meryl Silverburgh wrote: HI, I am following the

Re: NullPointerException during Fetch

2007-04-07 Thread Meryl Silverburgh
Thanks, but how do I include the index-basic, index-more plugin? I can't find that in the documentation. Thank you. On 4/7/07, Ratnesh,V2Solutions India [EMAIL PROTECTED] wrote: Check whether you have included index-basic index-more plugin in your nutch-site.xml file the same problem

Re: NullPointerException due to nonexistent (mis-pointed) segments directory

2006-04-18 Thread Michael Levy
I hope someone can help me with this problem. This works fine: #bin/nutch crawl urls.txt and it creates a directory named something like crawl-20060418105008, with a working index. However if I try to add any parameters beyond the root_url_file parameter I get the output below. I'm really

Re: NullPointerException

2006-03-06 Thread Howie Wang
I didn't see query-basic/query-more on your list of plugins included. This is what handles most queries usually. query-url will only handle parts of the query that look like url:http://www.google.com, and query-site handles site:www.google.com. Nothing seems to be handling just regular text in

Re: NullPointerException

2006-03-06 Thread Hasan Diwan
On 06/03/06, Howie Wang [EMAIL PROTECTED] wrote: Is query-basic or query-more included in your nutch-default.xml? It is indeed included in my nutch-site.xml :- <property> <name>plugin.includes</name>

Re: NullPointerException

2006-03-06 Thread Howie Wang
Hi, Hasan, Looking more carefully at the query-more plugin, it seems that it only adds functionality for date queries and type queries. I think you need to add query-basic to the list also to get it to search the default content. Can you try adding query-basic and running: bin/nutch search http

Re: NullPointerException

2006-03-05 Thread Stefan Groschupf
Hi, http or www are very good test queries. Double check that the nutch-default.xml which is inside the nutch.war points to the correct folder: <name>searcher.dir</name>. Stefan Am 06.03.2006 um 02:31 schrieb Hasan Diwan: I've followed the nutch tutorial for crawling and started tomcat from the

Re: NullPointerException

2006-03-05 Thread Stefan Groschupf
If none are being fetched, something is definitely wrong with your filter or url file. Yes, since it is a blog it may have dynamic pages like foo.com?entry=23; this is definitely filtered by default. - blog: http://www.find23.org company:

Re: NullPointerException

2006-03-05 Thread Hasan Diwan
Gentlemen: On 05/03/06, Richard Braman [EMAIL PROTECTED] wrote: This sounds like your crawl didn't get anything. I have seen that happen when the url wasn't added right, or the filter was bad. Pipe the crawl to crawl.log and look in there. It should show some pages being fetched. If none

RE: NullPointerException

2006-03-05 Thread Richard Braman
It did fetch some urls: -Original Message- From: Jack Tang [mailto:[EMAIL PROTECTED] Sent: Sunday, March 05, 2006 9:35 PM To: nutch-user@lucene.apache.org Subject: Re: NullPointerException Hey Hasan Crawling seems ok. Can you pls try org.apache.nutch.searcher.NutchBean [your-query

Re: NullPointerException

2006-03-05 Thread Hasan Diwan
Mr Tang: Crawling seems ok. Can you pls try org.apache.nutch.searcher.NutchBean [your-query-string] in shell/cmd? server: 7:20pm % ./bin/nutch org.apache.nutch.searcher.NutchBean hasan 060305 192042 10 parsing file:/home/hdiwan/nutch-0.7.1/conf/nutch-default.xml 060305 192042 10 parsing

Re: NullPointerException

2006-03-05 Thread Hasan Diwan
Mr Tang: On 05/03/06, Jack Tang [EMAIL PROTECTED] wrote: Weird! You are running nutch on local file system or distributed file system? Local file system And can you find the same query hasan via luke? Nope -- Cheers, Hasan Diwan [EMAIL PROTECTED]

Re: NullPointerException

2006-03-05 Thread Jack Tang
On 3/6/06, Hasan Diwan [EMAIL PROTECTED] wrote: Mr Tang: On 05/03/06, Jack Tang [EMAIL PROTECTED] wrote: Weird! You are running nutch on local file system or distributed file system? Local file system And can you find the same query hasan via luke? Nope ok. As Stefan said, can you get

Re: NullPointerException

2006-03-05 Thread Jack Tang
On 3/6/06, Hasan Diwan [EMAIL PROTECTED] wrote: On 05/03/06, Jack Tang [EMAIL PROTECTED] wrote: ok. As Stefan said, can you get any hit when you try to search http or www? No Hey, can you zip the index and send it to me directly? -- Cheers, Hasan Diwan [EMAIL PROTECTED] -- Keep

Re: NullPointerException

2006-03-05 Thread Jack Tang
Hasan, it seems your index is not complete. If you get whole (correct) indices, the index dir should include 1. segments file 2. deletable file 3. other files. I am not sure what's wrong in nutch-0.7.1 indexing, but is it possible now to upgrade to nutch 0.8 (svn version)? /Jack On 3/6/06, Jack

Re: NullPointerException

2006-03-05 Thread Hasan Diwan
On 05/03/06, Jack Tang [EMAIL PROTECTED] wrote: I am not sure what's wrong in nutch-0.7.1 indexing, but is it possible now to upgrade to nutch 0.8 (svn version)? It is possible, but I was under the assumption that 0.8 required NDFS? -- Cheers, Hasan Diwan [EMAIL PROTECTED]

Re: NullPointerException

2006-03-05 Thread Hasan Diwan
On 05/03/06, Jack Tang [EMAIL PROTECTED] wrote: You can still build it on local file system:) Build, yes, but what of deployment? Can I use it in the same way? At present, I don't have enough resources to run a distributed crawl. -- Cheers, Hasan Diwan [EMAIL PROTECTED]

Re: NullPointerException

2006-03-05 Thread Jack Tang
On 3/6/06, Hasan Diwan [EMAIL PROTECTED] wrote: On 05/03/06, Jack Tang [EMAIL PROTECTED] wrote: You can still build it on local file system:) Build, yes, but what of deployment? Can I use it in the same way? Of course yes. At present, I don't have enough resources to run a distributed

Re: NullPointerException

2006-03-05 Thread Hasan Diwan
Right then.. compiled the svn version of nutch. Tried running the crawl with it and this is the log: server: 11:32pm % ./bin/nutch crawl ../SpectraSearch/urls -dir ../SpectraSearch/crawl -depth 2 -threads 20 060305 233255 parsing