Hello Bertrand,

> On 7/31/07, Ard Schrijvers <[EMAIL PROTECTED]> wrote:
> > Regarding your usecase, having around 36.000.000 documents after one year in one
> > single ws with terabytes of data...so 100.000.000 docs within three years...Well, I
> > think you at least have to tune some settings :-)...
>
> Just to make sure there's no misunderstanding, the original post says
> "nodes", not "documents".
Yes, you are right! I must have misunderstood, since he is talking about "pushing 300-500 nodes a minute", so I understood he meant pushing docs into JR :-)

> So that's 36 million nodes a year, or 100 million after three years.
> If it was documents, it might be many more nodes than that.
>
> Although I haven't run those tests myself, I've talked with people
> doing tests with, IIRC, 150 million nodes, and such quantities are
> also regularly mentioned in Lucene tests

Yes, I agree, but in these cases you really have to understand how to tune and configure each separate component. For example, if you have a just-invalidated IndexReader and you are doing a search on a common word with a sort on title, or some range query, you might run into problems with 150 million nodes.

> , so I don't think this is
> necessarily a problem. But of course, it depends on how nodes are
> structured and on what's indexed.

Indexing seems pretty important to me when you have 150 million nodes. Actually, ATM I am sorting out the IndexingConfigurationImpl possibilities planned for the Jackrabbit 1.4 release, which look very promising to me (though OTOH, people must know how to configure the indexing properly, and this might be a bit harsh in the beginning, because you really have to know the content modelling structure AFAICS).

But as I misunderstood the requirements regarding nodes, and you know people who have run successful tests with 150 million nodes...well, then I will stick to my remark that you need to know how to tune some configuration parameters :-)

Regards,
Ard

> -Bertrand
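P.S. For anyone curious about the IndexingConfigurationImpl possibilities mentioned above: it reads an indexing_configuration.xml file with per-node-type index rules. A minimal sketch could look like the fragment below (nt:unstructured is just an example node type, and "title" and "text" are made-up property names; adapt them to your own content model):

```xml
<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM
    "http://jackrabbit.apache.org/dtd/indexing-configuration-1.0.dtd">
<configuration xmlns:nt="http://www.jcr.org/jcr/nt/1.0">
  <!-- For nt:unstructured nodes, index only the listed properties
       instead of everything; "title" is boosted so matches on it
       rank higher in full-text scoring. -->
  <index-rule nodeType="nt:unstructured">
    <property boost="2.0">title</property>
    <property>text</property>
  </index-rule>
</configuration>
```

Restricting what gets indexed like this is exactly the kind of tuning that matters once you are in the 150-million-node range, since it keeps the Lucene index (and things like sorting and range queries over it) much smaller.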
