Re: Nutch Exception

2008-05-13 Thread Vineet Garg
Vineet Garg wrote: I tried. But it is throwing the same exceptions as before. [EMAIL PROTECTED] wrote: This one: Path oPath = new Path("/hm/vineetg/nutch-0.9/crawl_test"); Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Vineet Garg [EMAIL

Re: Nutch Exception

2008-05-11 Thread Vineet Garg
-- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Vineet Garg [EMAIL PROTECTED] To: nutch-user@lucene.apache.org Sent: Friday, May 9, 2008 5:07:32 AM Subject: Re: Nutch Exception I have resolved this problem also. Now it is throwing the following error: Exception

Re: Nutch Exception

2008-05-11 Thread Vineet Garg
I tried. But it is throwing the same exceptions as before. [EMAIL PROTECTED] wrote: This one: Path oPath = new Path("/hm/vineetg/nutch-0.9/crawl_test"); Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Vineet Garg [EMAIL PROTECTED] To: nutch

Re: Nutch Exception

2008-05-09 Thread Vineet Garg
) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:91) at org.apache.nutch.searcher.IndexSearcher.init(IndexSearcher.java:70) at Test.main(Test.java:12) Am I missing something? Vineet Garg wrote: Hi, Yes there was some problem with Java installation. After resolving

Nutch Exception

2008-05-07 Thread Vineet Garg
/crawl_test. Does anybody have any idea why it is throwing this exception?? Regards, Vineet Garg

Re: Nutch Exception

2008-05-07 Thread Vineet Garg
a JVM/CLASSPATH problem... Which JVM are you using, is your CLASSPATH correct, what's in JAVA_HOME? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Vineet Garg [EMAIL PROTECTED] To: nutch-user@lucene.apache.org Sent: Wednesday, May 7, 2008 2

Nutch books

2008-05-05 Thread Vineet Garg
Hi, Is there any book or documentation on how to use the Nutch API? Regards, Vineet Garg

Re: Nutch API and Lucene API are same?

2008-05-04 Thread Vineet Garg
of Lucene that Nutch is using. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Vineet Garg [EMAIL PROTECTED] To: nutch-user@lucene.apache.org Sent: Friday, May 2, 2008 1:17:31 PM Subject: Nutch API and Lucene API are same? Hi, I have crawled

Nutch API and Lucene API are same?

2008-05-02 Thread Vineet Garg
on files indexed by nutch 0.9? Please reply. Regards, Vineet Garg

Problems with nutch

2008-04-10 Thread Vineet Garg
.gif files. Do I have to include some parser for this? If yes, how do I include it? Does anybody know a solution? Regards, Vineet Garg

Re: Nutch fetching skipped files

2008-04-04 Thread Vineet Garg
: Find my reply inline. On Wed, Apr 2, 2008 at 5:04 PM, Vineet Garg [EMAIL PROTECTED] wrote: Hi, I am using Nutch to crawl a local file system. I am crawling by bin/nutch crawl urls -dir crawl -depth 5 -topN 500 > crawl.log. But Nutch is fetching files, e.g. .css or .png files, which I have set

Re: Nutch fetching skipped files

2008-04-04 Thread Vineet Garg
I have tried that but it does not work.. [EMAIL PROTECTED] wrote: Hello Vinet, Try using regex-urlfilter instead of crawl-urlfilter. Regards, Arkadi -Original Message- From: Vineet Garg [mailto:[EMAIL PROTECTED] Sent: Wednesday, April 02, 2008 10:34 PM To: nutch-user

description of db.ignore.internal.links property

2008-04-02 Thread Vineet Garg
explain it? Regards, Vineet Garg

Re: Code to be modified

2008-04-02 Thread Vineet Garg
you use the crawl command? In this case crawl-urlfilter.txt is used, not regex-urlfilter.txt. What kind of errors are generated? Please post them as well. Best regards, martin On Mon, Mar 31, 2008 at 8:39 AM, Vineet Garg [EMAIL PROTECTED] wrote: Hi I configured the regex

Nutch fetching skipped files

2008-04-02 Thread Vineet Garg
Hi, I am using Nutch to crawl a local file system. I am crawling by bin/nutch crawl urls -dir crawl -depth 5 -topN 500 > crawl.log. But Nutch is fetching files, e.g. .css or .png files, which I have set to be skipped in crawl-urlfilter.txt, and throwing an error while parsing: fetching
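
For readers following this thread, here is a rough sketch of how a crawl-urlfilter.txt style rule list is commonly understood to behave: rules are tried in order, the first pattern found in the URL decides, a leading '-' means skip and a leading '+' means fetch, and a URL that matches no rule is skipped. The class name UrlFilterSketch, the accepts helper, and the two example rules are hypothetical illustrations, not Nutch code and not the poster's actual filter file.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration class; this is not part of Nutch.
class UrlFilterSketch {

    // Example rules in the style of crawl-urlfilter.txt ('+' accepts, '-' rejects).
    // These patterns are assumptions made for the sketch.
    static final List<String> EXAMPLE_RULES = List.of(
        "-\\.(css|png|gif)$",  // skip URLs ending in .css, .png or .gif
        "+^file:"              // accept everything else on the file protocol
    );

    // Returns true if the first rule whose regex is found in the URL is a '+' rule.
    // A URL that matches no rule at all is rejected.
    static boolean accepts(List<String> rules, String url) {
        for (String rule : rules) {
            char sign = rule.charAt(0);
            Pattern pattern = Pattern.compile(rule.substring(1));
            if (pattern.matcher(url).find()) {
                return sign == '+';
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> urls = List.of(
            "file:/data/site/index.html",
            "file:/data/site/style.css",
            "http://example.com/");
        for (String url : urls) {
            String verdict = accepts(EXAMPLE_RULES, url) ? "fetch" : "skip";
            System.out.println(url + " -> " + verdict);
        }
    }
}
```

Because the first matching rule wins, a broad '+' rule placed above the '-' suffix rule would let .css and .png URLs through, so rule order is worth checking when skipped files are still being fetched.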

Re: Code to be modified

2008-03-31 Thread Vineet Garg
parent directories for file protocol - misconfigured URLFilters http://wiki.apache.org/nutch/FAQ#head-f64e7589b2f12792d6d781f3db23840a8f3a1e10 You can achieve the desired behaviour by adjusting your regexes. Hope it helps, martin On Fri, Mar 28, 2008 at 12:32 PM, Vineet Garg [EMAIL PROTECTED

Code to be modified

2008-03-28 Thread Vineet Garg
can i modify this code? Vineet Garg