Hi Lewis,

I have created a jira [0] for this and uploaded the patch.

[0] : https://issues.apache.org/jira/browse/NUTCH-1514

Thanks,
Tejas

On Sun, Jan 6, 2013 at 12:01 AM, Tejas Patil <[email protected]> wrote:

> Hey Lewis,
>
> Yes. That's a good idea. There are so many properties in nutch-default.xml
> and having the deprecated ones adds to the confusion.
>
> Thanks,
> Tejas Patil
>
>
> On Sat, Jan 5, 2013 at 11:12 PM, Lewis John Mcgibbney <
> [email protected]> wrote:
>
>> I think it would be good to phase out some of the deprecated configuration
>> properties if possible. We have had several stable releases with these
>> props included...
>> Lewis
>>
>> On Jan 5, 2013 6:22 PM, "Tejas Patil" <[email protected]> wrote:
>>
>> > The generate.max.per.host property is deprecated but is still used
>> > inside the Generator logic.
>> > In Generator.java:
>> >
>> >     if (maxCount == -1 && oldMaxPerHost != -1) {
>> >       maxCount = oldMaxPerHost;
>> >       byDomain = false;
>> >     }
>> >
>> > ("generate.max.count" is stored in maxCount and "generate.max.per.host"
>> > is stored in oldMaxPerHost.)
>> > So despite having "generate.max.count" set to -1 in the config file,
>> > internally it was using 100.
>> >
>> > Thanks,
>> > Tejas Patil
>> >
>> >
>> > On Sat, Jan 5, 2013 at 6:19 PM, Bayu Widyasanyata
>> > <[email protected]> wrote:
>> >
>> > > Problem fixed :)
>> > >
>> > > Many thanks!
>> > >
>> > > On Sun, Jan 6, 2013 at 9:15 AM, Bayu Widyasanyata
>> > > <[email protected]> wrote:
>> > >
>> > > > I think this was the problem. In my nutch-site.xml I have:
>> > > >
>> > > > <property>
>> > > >   <name>generate.max.per.host</name>
>> > > >   <value>100</value>
>> > > > </property>
>> > > >
>> > > > even though it's deprecated.
>> > > > OK, I will remove it (from nutch-site.xml) and try to recrawl.
>> > > >
>> > > > Thanks Tejas!
>> > > >
>> > > >
>> > > > On Sun, Jan 6, 2013 at 8:59 AM, Tejas Patil <[email protected]>
>> > > > wrote:
>> > > >
>> > > >> Which properties have you set in nutch-site.xml?
>> > > >>
>> > > >> Thanks,
>> > > >> Tejas Patil
>> > > >>
>> > > >>
>> > > >> On Sat, Jan 5, 2013 at 5:31 PM, Bayu Widyasanyata
>> > > >> <[email protected]> wrote:
>> > > >>
>> > > >> > Hi,
>> > > >> >
>> > > >> > I got this warning message from Nutch:
>> > > >> >
>> > > >> > "Host or domain example.com has more than 100 URLs for all 1
>> > > >> > segments. Additional URLs won't be included in the fetchlist."
>> > > >> >
>> > > >> > The generate.max.count property in nutch-default.xml still has
>> > > >> > its default value, which is -1 (unlimited).
>> > > >> > Why does this message still appear?
>> > > >> >
>> > > >> > I use Nutch 1.6 with Solr 4.0.
>> > > >> >
>> > > >> > Thanks,
>> > > >> >
>> > > >> > --
>> > > >> > wassalam,
>> > > >> > [bayu]
>> > > >
>> > > > --
>> > > > wassalam,
>> > > > [bayu]
>> > >
>> > > --
>> > > wassalam,
>> > > [bayu]
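
For reference, the fallback discussed in this thread can be reproduced in isolation. What follows is only a minimal sketch, not the actual Nutch source: the class GenerateLimitSketch, the nested Limit type, and the resolve method are hypothetical names made up for illustration. Only the variables maxCount, oldMaxPerHost, and byDomain and the if-condition come from the Generator.java excerpt quoted above. The incoming value of byDomain is assumed to have been set elsewhere from the configuration.

```java
// Hypothetical standalone sketch; this is NOT the Nutch Generator class.
public class GenerateLimitSketch {

    /** Effective per-host/per-domain limit after the fallback is applied. */
    static final class Limit {
        final int maxCount;      // -1 means unlimited
        final boolean byDomain;  // false means URLs are counted per host

        Limit(int maxCount, boolean byDomain) {
            this.maxCount = maxCount;
            this.byDomain = byDomain;
        }
    }

    /**
     * Mirrors the condition quoted from Generator.java: when
     * generate.max.count is unset (-1) but the deprecated
     * generate.max.per.host is set, the deprecated value is used
     * and counting switches to per-host.
     */
    static Limit resolve(int maxCount, int oldMaxPerHost, boolean byDomain) {
        if (maxCount == -1 && oldMaxPerHost != -1) {
            maxCount = oldMaxPerHost;
            byDomain = false;
        }
        return new Limit(maxCount, byDomain);
    }

    public static void main(String[] args) {
        // The situation from this thread: max.count = -1, max.per.host = 100.
        Limit limit = resolve(-1, 100, false);
        System.out.println(limit.maxCount + " " + limit.byDomain); // prints "100 false"
    }
}
```

With generate.max.per.host removed from nutch-site.xml (that is, oldMaxPerHost left at -1 in this sketch), the condition is false and maxCount stays at -1, which matches the "Problem fixed" report above.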

