Hi Stefan

Thanks for your mail.

Since I am using nutch-0.7, what I would like to know is: what is the upper
limit on the webdb size in nutch-0.7, if any such limit exists?

Will the generate step work for a webdb built from one TB of data (just an
example)?

And what is the difference between the webdb in nutch-0.7 and its nutch-0.8
counterparts (crawldb and linkdb) that removes the size limit in nutch-0.8?

Rgds
Prabhu



On 1/30/06, Stefan Groschupf <[EMAIL PROTECTED]> wrote:
>
> You can already use ndfs in 0.7; however, if the webdb is too
> large, it takes too much time to generate segments.
> So the problem is the webdb size, not the hdd limit.
>
> On 30.01.2006 at 07:31, Raghavendra Prabhu wrote:
>
> > Hi Stefan
> >
> > So can I assume that hard disk space is the only constraint in
> > nutch-0.7?
> >
> > In nutch-0.8, since you can store it over ndfs, it is theoretically
> > unlimited.
> >
> > Is my point above true? (In a nutshell, I want to know whether the
> > only constraint is the space for storing the nutch indexed data.)
> >
> > I will try to do some testing and, if possible, contribute to the wiki.
> >
> > Rgds
> >
> > Prabhu
> >
> >
> >
> > On 1/30/06, Stefan Groschupf <[EMAIL PROTECTED]> wrote:
> >>
> >> Any performance testing contribution to the wiki is welcome, I
> >> guess. :)
> >> There are no such values, apart from some statements regarding
> >> search speed in the wiki.
> >> With nutch 0.8 there is theoretically no size limit any more.
> >> Stefan
> >> On 29.01.2006 at 13:35, Raghavendra Prabhu wrote:
> >>
> >>> Is there any benchmark on how nutch performs?
> >>>
> >>> I mean, say 1 GB of data is given as the input: how much time will
> >>> it take to index this data on a 10 Mb/s network?
> >>>
> >>> And while crawling, what volume of data can be loaded? (This is in
> >>> terms of search: how much can a crawl segment hold?)
> >>>
> >>> Is there any performance limit to it? Is the only criterion the
> >>> space to store the indexed data? Is there any limit to it?
> >>>
> >>> Rgds
> >>>
> >>> Prabhu
> >>
> >> ---------------------------------------------------------------
> >> company:        http://www.media-style.com
> >> forum:        http://www.text-mining.org
> >> blog:            http://www.find23.net
> >>
> >>
> >>
> >>
>
>
