crawling a subfolder

2016-10-03 Thread Néstor
Hi, I am using Nutch for the first time. When I crawl www.mysite.com, it crawls for a while. When I try to crawl a subfolder like www.mysite.com/mysubfolder, it crawls for about 1 sec. My urls/seed.txt is set to http://www.mysite.com/mysubfolder and my regex-urlfilter.txt uses the default except for the

Re: why the results have diff number of fields

2016-10-04 Thread Néstor
Maybe because I am trying to crawl just a subfolder, mysite.com/subfolder, and I am having problems configuring it to do this, so it is crawling other pages from the parent directory. Thanks! On Tue, Oct 4, 2016 at 4:00 AM, Markus Jelsma wrote: > Well, probably because you or something i

Re: Nutch 2.3.1

2016-10-10 Thread Néstor
Can you send it to me also? Thanks, Néstor On Oct 10, 2016 9:33 PM, "MrSrivastavaRK ." wrote: > > Hi, > I have successfully indexed content in Elasticsearch using Nutch 1.12 REST > API. I can send you API details, if you want, for reference. > > Regards > Raje

nutch 1.7 solr 5.52 ubuntu

2016-10-14 Thread Néstor
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.nutch.crawl.Crawl.main(Crawl.java:55) *How can I make it crawl the entire subfolder?* *And what does that error mean?* Thanks, Néstor -- Né§t☼r *Authority gone to one's head is the greatest enemy of Truth*