Hello Bertrand,
> 
> On 7/31/07, Ard Schrijvers <[EMAIL PROTECTED]> wrote:
> > Regarding your usecase, having around 36.000.000 documents
> > after one year in one single ws with terabytes of data... so
> > 100.000.000 docs within three years... Well, I think you at
> > least have to tune some settings :-)...
> 
> Just to make sure there's no misunderstanding, the original post says
> "nodes", not "documents".

Yes, you are right! I must have misread: since he is talking about "pushing 
300-500 nodes a minute", I understood he meant pushing docs into JR :-) 

> 
> So that's 36 million nodes a year, or 100 million after three years.
> If it was documents, it might be many more nodes than that.
> 
> Although I haven't run those tests myself, I've talked with people
> doing tests with, IIRC, 150 million nodes, and such quantities are
> also regularly mentioned in Lucene tests

Yes, I agree, but in these cases you really have to understand how to tune and 
configure each separate component. For example, if you have a just-invalidated 
indexReader and you run a search on a common word with a sort on title, or some 
range query, you might run into problems with 150 million nodes. 
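To make the sorting concern concrete: sorting search hits requires a per-reader cache holding the sort field's value for every document, and that cache is rebuilt from scratch the first time a freshly (re)opened indexReader is used, which is O(numDocs) in both time and memory. A minimal plain-Java sketch of that idea (an illustration only, not actual Jackrabbit or Lucene code):

```java
import java.util.*;

public class FieldCacheSketch {

    // Hypothetical stand-in for Lucene's per-reader field cache:
    // one "title" value per document, built by touching every doc.
    static String[] buildTitleCache(List<String> titles) {
        // O(numDocs) work and memory, repeated on every reader reopen
        return titles.toArray(new String[0]);
    }

    // Sorting hits by title just looks values up in the cache.
    static int[] sortByTitle(String[] cache, int[] hits) {
        Integer[] docs = Arrays.stream(hits).boxed().toArray(Integer[]::new);
        Arrays.sort(docs, Comparator.comparing((Integer d) -> cache[d]));
        return Arrays.stream(docs).mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        // doc 0 = "cherry", doc 1 = "apple", doc 2 = "banana"
        List<String> titles = Arrays.asList("cherry", "apple", "banana");
        String[] cache = buildTitleCache(titles);
        int[] sorted = sortByTitle(cache, new int[]{0, 1, 2});
        System.out.println(Arrays.toString(sorted)); // prints "[1, 2, 0]"
    }
}
```

With 150 million docs, the cache build on a just-invalidated reader is the expensive step, not the sort itself; that is why such queries are the ones that need tuning.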

>, so I don't think this is
> necessarily a problem. But of course, it depends on how nodes are
> structured and on what's indexed.

Indexing seems pretty important to me when you have 150 million nodes. Actually, 
ATM I am sorting out the IndexingConfigurationImpl possibilities planned for the 
Jackrabbit 1.4 release, which look very promising to me (though OTOH, people 
must know how to configure the indexing properly, and this might be a bit harsh 
in the beginning because you really have to know the content modelling 
structure AFAICS). 
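For reference, IndexingConfigurationImpl is driven by an XML indexing configuration file. A minimal sketch of what such a rule might look like (the node type and property names here are just examples, not part of any real content model):

```xml
<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM
    "http://jackrabbit.apache.org/dtd/indexing-configuration-1.0.dtd">
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <!-- Example rule: index only the named properties of nt:unstructured
       nodes; "title" gets a boost so it ranks higher in full-text hits. -->
  <index-rule nodeType="nt:unstructured">
    <property boost="2.0">title</property>
    <property>text</property>
  </index-rule>
</configuration>
```

This is exactly why you need to know your content modelling structure: the rules are keyed on node types and property names, so a wrong or missing rule silently changes what gets indexed.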

But as I misunderstood the requirements regarding nodes, and you know people 
who have run successful tests with 150 million nodes... well, then I will stick 
to my remark that you need to know how to tune some configuration parameters :-) 

Regards Ard

> 
> -Bertrand
> 
