Hi Lewis,
I have created a Jira issue [0] for this and uploaded the patch.

[0] : https://issues.apache.org/jira/browse/NUTCH-1514
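
For anyone reading the archive later, the fallback quoted further down can be
sketched in isolation. This is only a minimal illustration, not the actual
Nutch code: the class MaxCountFallback, its getInt helper and the plain Map
standing in for the job configuration are all made up for the example, and I
am assuming both properties fall back to -1 when they are not set.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the back-compat check in Generator.java.
class MaxCountFallback {

    // Reads an int property from a plain map; returns defaultValue when unset.
    static int getInt(Map<String, String> conf, String name, int defaultValue) {
        String value = conf.get(name);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    // Returns the limit that ends up in effect after the fallback is applied.
    static int effectiveMaxCount(Map<String, String> conf) {
        int maxCount = getInt(conf, "generate.max.count", -1);
        int oldMaxPerHost = getInt(conf, "generate.max.per.host", -1);
        // The deprecated property wins only when the new one is left at -1.
        if (maxCount == -1 && oldMaxPerHost != -1) {
            maxCount = oldMaxPerHost;
        }
        return maxCount;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("generate.max.count", "-1");
        conf.put("generate.max.per.host", "100");
        System.out.println(effectiveMaxCount(conf)); // prints 100

        conf.remove("generate.max.per.host");
        System.out.println(effectiveMaxCount(conf)); // prints -1
    }
}
```

With the deprecated property removed from nutch-site.xml, the second case
applies and the limit stays at -1 (unlimited).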

Thanks,
Tejas


On Sun, Jan 6, 2013 at 12:01 AM, Tejas Patil <[email protected]> wrote:

> Hey Lewis,
>
> Yes. That's a good idea. There are so many properties in nutch-default.xml
> and having the deprecated ones adds to the confusion.
>
> Thanks,
> Tejas Patil
>
>
> On Sat, Jan 5, 2013 at 11:12 PM, Lewis John Mcgibbney <
> [email protected]> wrote:
>
>> I think it would be good to phase out some of the deprecated configuration
>> properties if possible. We have had several stable releases with these
>> props included...
>> Lewis
>> On Jan 5, 2013 6:22 PM, "Tejas Patil" <[email protected]> wrote:
>>
>> > The generate.max.per.host property is deprecated but is still used inside
>> > the Generator logic.
>> > In Generator.java:
>> >
>> >       if (maxCount == -1 && oldMaxPerHost != -1) {
>> >         maxCount = oldMaxPerHost;
>> >         byDomain = false;
>> >       }
>> >
>> > ("generate.max.count" is stored in maxCount and "generate.max.per.host" is
>> > stored in oldMaxPerHost.)
>> > So despite "generate.max.count" being -1 in the config file, the Generator
>> > was internally using 100, the value of "generate.max.per.host".
>> >
>> > Thanks,
>> > Tejas Patil
>> >
>> >
>> > On Sat, Jan 5, 2013 at 6:19 PM, Bayu Widyasanyata
>> > <[email protected]> wrote:
>> >
>> > > Problem fixed :)
>> > >
>> > > Many thanks!
>> > >
>> > > On Sun, Jan 6, 2013 at 9:15 AM, Bayu Widyasanyata
>> > > <[email protected]> wrote:
>> > >
>> > > > I think this was the problem. My nutch-site.xml contains:
>> > > >
>> > > >    <property>
>> > > >        <name>generate.max.per.host</name>
>> > > >        <value>100</value>
>> > > >    </property>
>> > > >
>> > > > even though it's deprecated.
>> > > > OK, I will remove it from nutch-site.xml and try the crawl again.
>> > > >
>> > > > Thanks Tejas!
>> > > >
>> > > >
>> > > > On Sun, Jan 6, 2013 at 8:59 AM, Tejas Patil <[email protected]> wrote:
>> > > >
>> > > >> Which properties have you set in nutch-site.xml?
>> > > >>
>> > > >> Thanks,
>> > > >> Tejas Patil
>> > > >>
>> > > >>
>> > > >> On Sat, Jan 5, 2013 at 5:31 PM, Bayu Widyasanyata
>> > > >> <[email protected]> wrote:
>> > > >>
>> > > >> > Hi,
>> > > >> >
>> > > >> > I got a warning message from Nutch:
>> > > >> >
>> > > >> > "Host or domain example.com has more than 100 URLs for all 1 segments.
>> > > >> > Additional URLs won't be included in the fetchlist."
>> > > >> >
>> > > >> > The generate.max.count property in nutch-default.xml still has its
>> > > >> > default value, which is -1 (unlimited).
>> > > >> > Why does this message still appear?
>> > > >> >
>> > > >> > I use Nutch 1.6 with Solr 4.0.
>> > > >> >
>> > > >> > Thanks,
>> > > >> >
>> > > >> > --
>> > > >> > wassalam,
>> > > >> > [bayu]
>> > > >> >
>> > > >>
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > wassalam,
>> > > > [bayu]
>> > >
>> > >
>> > >
>> > >
>> > > --
>> > > wassalam,
>> > > [bayu]
>> > >
>> >
>>
>
>
