I would like to know how the fetcher schedules its work. Is it perhaps based on graph theory?
On Tue, Nov 2, 2010 at 8:30 PM, Markus Jelsma wrote:
> Hello Dennis,
>
> How's it going?
>
> Cheers,
>
> On Monday 17 May 2010 03:27:58 Dennis Kubes wrote:
> > Hi Everyone,
> >
> > It has been a long time coming but I have fin
Writing plugins is one of the most important topics, and not many
comprehensive tutorials are available on it. We also
don't have a video tutorial for any of this. If you also cover Nutch
+ Hadoop, that would be very cool.
I will be available for any help.
On Tue, Nov 2, 2010 at 8:30 AM,
Hello Dennis,
How's it going?
Cheers,
On Monday 17 May 2010 03:27:58 Dennis Kubes wrote:
> Hi Everyone,
>
> It has been a long time coming but I have finally started to write a
> book on Nutch. It will be self published and should be available in PDF
> / paperback form in less than a month ho
I need to know how to integrate Nutch with Solr,
and also how to track the index time of an article in Solr and then sort on it. If your book
can use such a case study, you've got my purchase.
Mambe Churchill Nanje
237 77545907,
AfroVisioN Founder, President,CEO
http://www.afrovisiongroup.com | http://mam
I wanted to thank everyone for all the great responses. It really helps
in putting together information that will be useful to everyone.
I am also in the process of launching a blog about Nutch/Hadoop and am
working to get the first post (with video) done and up. I will update
the list when
,
Arkadi
> -----Original Message-----
> From: Dennis Kubes [mailto:ku...@apache.org]
> Sent: Monday, May 17, 2010 11:28 AM
> To: user@nutch.apache.org
> Subject: Writing a Book on Nutch
>
> Hi Everyone,
>
> It has been a long time coming but I have finally started
I second this: best practices for a production system, and how to keep
re-crawling without bloating the whole system.
On 5/17/2010 3:40 AM, Piet van Remortel wrote:
Re-crawling and controlling that process seems to me like an issue that
needs covering.
Thanks
Piet
Belgium
On Mon, May 17, 2010 at 9:3
Hi,
> "re-crawling and controlling that process seems like an issue in need of
> covering to me"
>
> I am also very interested in understanding that better,
> as well as better strategies for crawling a single site and some benchmarks
> linking configuration to performance.
+1 for information on bench
I'd like to second this: tie-ins to Hadoop and other ways to analyze your index
are a big mystery to me when dealing with Nutch!
----- Original Message -----
From: Alex Basa
To: user@nutch.apache.org
Sent: Sun, May 16, 2010 9:18:01 PM
Subject: Re: Writing a Book on Nutch
Dennis,
One
> > * Aging of your Nutch segments. When do you really need to blow away
> > everything and start from scratch.
> > * How do you recover from an interrupted / crashed spider / index run
> that
> > took days or weeks to run (so you don't want to "just start over"
IDEA-ENG / Cell: 408-829-6513
>
>
> On Sun, May 16, 2010 at 9:18 PM, Alex Basa wrote:
>
>> Dennis,
>>
>> One topic that had taken me a long time to figure out and lots of people
>> have been having issues with is doing an incremental index. I don't think
>>
Hey,
On Mon, May 17, 2010 at 04:27, Dennis Kubes wrote:
> Hi Everyone,
>
> It has been a long time coming but I have finally started to write a book
> on Nutch. It will be self published and should be available in PDF /
> paperback form in less than a month hopefully.
>
> A while back we discus
I would like one chapter on how to configure Nutch for focused crawling: best
practices and strategies, especially to avoid host-blocking.
On Mon, May 17, 2010 at 6:57 AM, Dennis Kubes wrote:
> Hi Everyone,
>
> It has been a long time coming but I have finally started to write a book
> on Nutch
think
> it was documented anywhere and it would be great if you could cover it.
>
> Thanks,
>
> Alex
>
> --- On Sun, 5/16/10, Dennis Kubes wrote:
>
> > From: Dennis Kubes
> > Subject: Writing a Book on Nutch
> > To: user@nutch.apache.org
> > Date:
Re-crawling and controlling that process seems to me like an issue that
needs covering.
Thanks
Piet
Belgium
On Mon, May 17, 2010 at 9:32 AM, Alexander Aristov <
alexander.aris...@gmail.com> wrote:
> I would definitely want to see answers to questions about distributed
> search.
>
> Starting from
I would definitely want to see answers to questions about distributed
search.
Starting from crawling (how to run it in distributed mode, where to store
collected pages and indexes)
and ending with questions about the relevancy of results obtained from different
search servers.
Best Regards
Alexander Ar
> From: Dennis Kubes
> Subject: Writing a Book on Nutch
> To: user@nutch.apache.org
> Date: Sunday, May 16, 2010, 8:27 PM
> Hi Everyone,
>
> It has been a long time coming but I have finally started
> to write a book on Nutch. It will be self published
> and should be avai
Hi Everyone,
It has been a long time coming but I have finally started to write a
book on Nutch. It will be self published and should be available in PDF
/ paperback form in less than a month hopefully.
A while back we discussed a Nutch training seminar on the list. I am
not ready to do a