> > There seems to be no crawl-urlfilter file indeed. Don't know why it's
> > gone since the crawl command is still there. You can find the file in
> > the 1.2 release:
> > http://svn.apache.org/viewvc/nutch/branches/branch-1.2/conf/
>
> Crawl-urlfilter has been removed purposefully as it did not add anything
> to the other url filters (automaton | regex) in terms of functionality. By
> default the urlfilters contain (+.) which IIRC was what the
> Crawl-urlfilter used to do.
>
That's reasonable. But now new users are unaware and don't know what to do with this error message.

> > > Thanks for a quick reply.
> > >
> > > I searched in the nutch directory but still do not see that file :(.
> > > Here's the complete file list inside the runtime/local/conf directory.
> > >
> > > us137390:conf parampreetsethi$ pwd
> > > /Users/parampreetsethi/Documents/workspace/nutch/runtime/local/conf
> > > us137390:conf parampreetsethi$ ls -t
> > > automaton-urlfilter.txt  domain-urlfilter.txt  nutch-default.xml  prefix-urlfilter.txt  solrindex-mapping.xml
> > > configuration.xsl        httpclient-auth.xml   nutch-site.xml     regex-normalize.xml   subcollections.xml
> > > domain-suffixes.xml      log4j.properties      parse-plugins.dtd  regex-urlfilter.txt   suffix-urlfilter.txt
> > > domain-suffixes.xsd      nutch-conf.xsl        parse-plugins.xml  schema.xml            tika-mimetypes.xml
> > >
> > > By the way, I tried deploying the code by checking it out from the svn
> > > repository, but could not build it. I was getting the following error:
> > >
> > > resolve-default:
> > > [ivy:resolve] :: Ivy 2.2.0 - 20100923230623 :: http://ant.apache.org/ivy/ ::
> > > [ivy:resolve] :: loading settings :: file = /Users/parampreetsethi/Documents/workspace/nutch/ivy/ivysettings.xml
> > > [ivy:resolve]
> > > [ivy:resolve] :: problems summary ::
> > > [ivy:resolve] :::: WARNINGS
> > > [ivy:resolve] module not found: org.apache.gora#gora-core;0.2-incubating
> > > [ivy:resolve] ==== local: tried
> > > [ivy:resolve] /Users/parampreetsethi/.ivy2/local/org.apache.gora/gora-core/0.2-incubating/ivys/ivy.xml
> > > [ivy:resolve] -- artifact org.apache.gora#gora-core;0.2-incubating!gora-core.jar:
> > > [ivy:resolve] /Users/parampreetsethi/.ivy2/local/org.apache.gora/gora-core/0.2-incubating/jars/gora-core.jar
> > > [ivy:resolve] module not found: org.apache.gora#gora-sql;0.2-incubating
> > > [ivy:resolve] ==== local: tried
> > > [ivy:resolve] /Users/parampreetsethi/.ivy2/local/org.apache.gora/gora-sql/0.2-incubating/ivys/ivy.xml
> > > [ivy:resolve] -- artifact org.apache.gora#gora-sql;0.2-incubating!gora-sql.jar:
> > > [ivy:resolve] /Users/parampreetsethi/.ivy2/local/org.apache.gora/gora-sql/0.2-incubating/jars/gora-sql.jar
> > > [ivy:resolve] ::::::::::::::::::::::::::::::::::::::::::::::
> > > [ivy:resolve] :: UNRESOLVED DEPENDENCIES ::
> > > [ivy:resolve] ::::::::::::::::::::::::::::::::::::::::::::::
> > > [ivy:resolve] :: org.apache.gora#gora-core;0.2-incubating: not found
> > > [ivy:resolve] :: org.apache.gora#gora-sql;0.2-incubating: not found
> > > [ivy:resolve] ::::::::::::::::::::::::::::::::::::::::::::::
> > > [ivy:resolve]
> > > [ivy:resolve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
> > >
> > > BUILD FAILED
> > > /Users/parampreetsethi/Documents/workspace/nutch/build.xml:458: impossible to resolve dependencies:
> > > resolve failed - see output for details
> > >
> > > -param
> > >
> > > On 7/11/11 5:56 PM, "Jerry E. Craig, Jr." <[email protected]> wrote:
> > > > Look down a little further for the
> > > >
> > > > or
> > > > runtime/local/bin/nutch (version >= 1.3)
> > > >
> > > > If you download the bin then it's in the runtime directory.
> > > >
> > > > Jerry E. Craig, Jr.
> > > >
> > > > -----Original Message-----
> > > > From: Sethi, Parampreet [mailto:[email protected]]
> > > > Sent: Monday, July 11, 2011 2:51 PM
> > > > To: [email protected]
> > > > Subject: Nutch Novice help
> > > >
> > > > Hi All,
> > > >
> > > > Sorry for such a naïve question, I downloaded the nutch 1.3 binary today
> > > > and am trying to set it up as mentioned in the Tutorial at
> > > > http://wiki.apache.org/nutch/NutchTutorial
> > > >
> > > > However, I am not able to find crawl-urlfilter.txt inside the conf
> > > > directory. Is there any other place where I should look for this file?
> > > >
> > > > Thanks
> > > > Param
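
To make the (+.) default mentioned above concrete, here is a rough Python sketch of how rules in a file like regex-urlfilter.txt are evaluated. It assumes the usual convention for that file: each rule is a `+` (accept) or `-` (reject) followed by a regex, the first rule that matches decides, and a URL that matches nothing is dropped. The function names, the rule sets, and the example.org host are made up for illustration, and Python's `re` is only an approximation of the Java regex engine Nutch uses, so treat this as a way to reason about the rules rather than as Nutch's implementation.

```python
import re


def load_rules(text):
    """Parse filter rules: '+regex' accepts, '-regex' rejects.

    Blank lines and lines starting with '#' are skipped.
    """
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError(f"rule must start with + or -: {line!r}")
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """Return the verdict of the first rule found in the URL.

    A URL that no rule matches is rejected (assumed behaviour).
    """
    for accept, pattern in rules:
        if pattern.search(url):
            return accept
    return False


# Hypothetical rule set ending in the catch-all "+." discussed in the thread.
DEFAULT_RULES = load_rules(r"""
# skip some non-http schemes
-^(file|ftp|mailto):
# accept anything else
+.
""")

# Hypothetical restricted variant: the catch-all is replaced by one host pattern.
RESTRICTED_RULES = load_rules(r"""
-^(file|ftp|mailto):
+^http://([a-z0-9]*\.)*example\.org/
""")

if __name__ == "__main__":
    for url in ("http://www.example.org/a", "http://other.com/", "ftp://example.org/x"):
        print(url, accepts(DEFAULT_RULES, url), accepts(RESTRICTED_RULES, url))
```

With the catch-all `+.` in place every URL that is not rejected earlier is accepted, which is why a crawl limited to one site needs that last rule swapped for a host-specific pattern, as the old crawl-urlfilter.txt step in the tutorial did.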

