On 12 July 2011 10:30, Julien Nioche <[email protected]> wrote:

>
>
>> > > There seems to be no crawl-urlfilter file indeed. Don't know why it's gone
>> > > since the crawl command is still there. You can find the file in the 1.2
>> > > release: http://svn.apache.org/viewvc/nutch/branches/branch-1.2/conf/
>> >
>> > Crawl-urlfilter has been removed purposefully as it did not add anything
>> > to the other url filters (automaton | regex) in terms of functionality.
>> > By default the urlfilters contain (+.) which IIRC was what the
>> > Crawl-urlfilter used to do.
>> >
>>
>> That's reasonable. But now new users are unaware and don't know what to do
>> with this error message.
>>
>
> Yep, the tutorial needs updating indeed
>

done
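
For anyone landing on this thread from the old tutorial: the rules that used to go in crawl-urlfilter.txt now go in conf/regex-urlfilter.txt. Below is a rough Python sketch of how I understand the rule file to be read, not the actual Nutch code. Each rule is a '+' (include) or '-' (exclude) followed by a regex, the first matching rule decides, and a URL that matches no rule is dropped. The example.org host rule is hypothetical, and matching the regex anywhere in the URL (re.search) is my assumption about the filter's behaviour.

```python
import re

# Rules in the style of conf/regex-urlfilter.txt: '+' includes, '-' excludes.
# The example.org pattern is a hypothetical illustration, not from this thread.
RULES = r"""
# skip file:, ftp: and mailto: URLs
-^(file|ftp|mailto):
# hypothetical: accept only URLs under example.org
+^http://([a-z0-9]*\.)*example\.org/
"""

def parse_rules(text):
    """Return (accept, compiled_regex) pairs, skipping blanks and comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError(f"rule must start with + or -: {line!r}")
        rules.append((sign == "+", re.compile(pattern)))
    return rules

def accepts(url, rules):
    """First matching rule decides; a URL that matches nothing is rejected."""
    for accept, regex in rules:
        if regex.search(url):  # assumption: match anywhere in the URL
            return accept
    return False

rules = parse_rules(RULES)
for url in ("http://www.example.org/index.html",
            "ftp://example.org/file.txt",
            "http://other.com/"):
    print(url, accepts(url, rules))
```

Under these assumptions, replacing the host rule with a final "+." line accepts every URL that was not excluded by an earlier rule, which is the catch-all behaviour mentioned above.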


>
>
>
>>
>> > > > Thanks for a quick reply.
>> > > >
>> > > > I searched in the nutch directory but still do not see that file :(.
>> > > > Here's the complete file list inside the runtime/local/conf directory.
>> > > >
>> > > > us137390:conf parampreetsethi$ pwd
>> > > > /Users/parampreetsethi/Documents/workspace/nutch/runtime/local/conf
>> > > > us137390:conf parampreetsethi$ ls -t
>> > > > automaton-urlfilter.txt    domain-urlfilter.txt    nutch-default.xml
>> > > > prefix-urlfilter.txt    solrindex-mapping.xml
>> > > > configuration.xsl    httpclient-auth.xml    nutch-site.xml
>> > > > regex-normalize.xml    subcollections.xml
>> > > > domain-suffixes.xml    log4j.properties    parse-plugins.dtd
>> > > > regex-urlfilter.txt    suffix-urlfilter.txt
>> > > > domain-suffixes.xsd    nutch-conf.xsl        parse-plugins.xml
>> > > > schema.xml tika-mimetypes.xml
>> > > >
>> > > > By the way, I tried deploying the code by checking out from the svn
>> > > > repository, but could not build it. I was getting the following error:
>> > > >
>> > > > resolve-default:
>> > > > [ivy:resolve] :: Ivy 2.2.0 - 20100923230623 :: http://ant.apache.org/ivy/ ::
>> > > > [ivy:resolve] :: loading settings :: file = /Users/parampreetsethi/Documents/workspace/nutch/ivy/ivysettings.xml
>> > > > [ivy:resolve]
>> > > > [ivy:resolve] :: problems summary ::
>> > > > [ivy:resolve] :::: WARNINGS
>> > > > [ivy:resolve]         module not found: org.apache.gora#gora-core;0.2-incubating
>> > > > [ivy:resolve]     ==== local: tried
>> > > > [ivy:resolve]       /Users/parampreetsethi/.ivy2/local/org.apache.gora/gora-core/0.2-incubating/ivys/ivy.xml
>> > > > [ivy:resolve]       -- artifact org.apache.gora#gora-core;0.2-incubating!gora-core.jar:
>> > > > [ivy:resolve]       /Users/parampreetsethi/.ivy2/local/org.apache.gora/gora-core/0.2-incubating/jars/gora-core.jar
>> > > > [ivy:resolve]         module not found: org.apache.gora#gora-sql;0.2-incubating
>> > > > [ivy:resolve]     ==== local: tried
>> > > > [ivy:resolve]       /Users/parampreetsethi/.ivy2/local/org.apache.gora/gora-sql/0.2-incubating/ivys/ivy.xml
>> > > > [ivy:resolve]       -- artifact org.apache.gora#gora-sql;0.2-incubating!gora-sql.jar:
>> > > > [ivy:resolve]       /Users/parampreetsethi/.ivy2/local/org.apache.gora/gora-sql/0.2-incubating/jars/gora-sql.jar
>> > > > [ivy:resolve]         ::::::::::::::::::::::::::::::::::::::::::::::
>> > > > [ivy:resolve]         ::          UNRESOLVED DEPENDENCIES         ::
>> > > > [ivy:resolve]         ::::::::::::::::::::::::::::::::::::::::::::::
>> > > > [ivy:resolve]         :: org.apache.gora#gora-core;0.2-incubating: not found
>> > > > [ivy:resolve]         :: org.apache.gora#gora-sql;0.2-incubating: not found
>> > > > [ivy:resolve]         ::::::::::::::::::::::::::::::::::::::::::::::
>> > > > [ivy:resolve]
>> > > > [ivy:resolve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
>> > > >
>> > > > BUILD FAILED
>> > > > /Users/parampreetsethi/Documents/workspace/nutch/build.xml:458: impossible to resolve dependencies:
>> > > >     resolve failed - see output for details
>> > > >
>> > > > -param
>> > > >
>> > > > On 7/11/11 5:56 PM, "Jerry E. Craig, Jr." <[email protected]> wrote:
>> > > > > Look down a little further for the
>> > > > >
>> > > > > or
>> > > > > runtime/local/bin/nutch (version >= 1.3)
>> > > > >
>> > > > > If you download the bin then it's in the runtime directory.
>> > > > >
>> > > > > Jerry E. Craig, Jr.
>> > > > >
>> > > > > -----Original Message-----
>> > > > > From: Sethi, Parampreet [mailto:[email protected]]
>> > > > > Sent: Monday, July 11, 2011 2:51 PM
>> > > > > To: [email protected]
>> > > > > Subject: Nutch Novice help
>> > > > >
>> > > > > Hi All,
>> > > > >
>> > > > > Sorry for such a naïve question. I downloaded the Nutch 1.3 binary today
>> > > > > and am trying to set it up as described in the tutorial at
>> > > > > http://wiki.apache.org/nutch/NutchTutorial
>> > > > >
>> > > > > However, I am not able to find crawl-urlfilter.txt inside the conf
>> > > > > directory. Is there any other place where I should look for this file?
>> > > > >
>> > > > > Thanks
>> > > > > Param
>>
>
>
>
> --
> *
> *Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
>



-- 
*
*Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
