On Wed, Feb 24, 2010 at 03:42:20PM +0200, Sami Siren wrote:
> Hannu,
>
> Do you use the same set of QueryFilters both in the webapp and when
> running from shell?
>
> Perhaps your filter is not executed when running from the CLI? You can
> verify how your query is transformed by running bin/nutch
> org.a
Andrzej Bialecki wrote:
>
> I was involved in a project to implement this (as a proprietary plugin).
> ...
> So, if you target 10 sites, you can make it work. If you target 10,000
> sites all using slightly different methods, then forget it.
>
>
> --
> Best regards,
> Andrzej Bialecki <
On 2010-03-11 15:53, nikinch wrote:
Hi everyone
I've been using Nutch for a while now and I've hit a snag.
I'm trying to find where new linked pages are added to the segment as a
specific entry.
To make myself clear, I've been through the fetch class and the crawlDBFilter
and reducer.
B
On Thu, Mar 11, 2010 at 8:24 PM, Graziano Aliberti
wrote:
> Hi everyone,
>
> I'm trying to use Nutch ver. 1.0 on a system behind a Squid proxy. When
> I try to fetch my website list, I see in the log file that
> authentication failed...
>
> I've configured my nutch-site.xml file wit
Hi everyone,
I'm trying to use Nutch ver. 1.0 on a system behind a Squid proxy.
When I try to fetch my website list, I see in the log file that
authentication failed...
I've configured my nutch-site.xml file with all the properties needed
for proxy auth, but my error is "http
Hi everyone
I've been using Nutch for a while now and I've hit a snag.
I'm trying to find where new linked pages are added to the segment as a
specific entry.
To make myself clear, I've been through the fetch class and the crawlDBFilter
and reducer.
But I'm looking for the initial entry w
Hi everyone
Not sure exactly where to post this question. Sorry for the double
post.
I've been using Nutch for a while now and I've hit a snag.
I'm trying to find where new linked pages are added to the segment as a
specific entry.
To make myself clear, I've been through the fetch c