On Thu, Dec 4, 2008 at 1:24 AM, John Wang <[EMAIL PROTECTED]> wrote:

> Nice!
> Some questions:
>
> 1) one index?
>
no, but two individual ones today were around 100M docs
> 2) how big is your document? e.g. how many terms etc.
>
last one built has over 4M terms

> 3) are you serving (searching) the docs in realtime?
>
i don't understand this question, but searching is slower if i am indexing on a disk that's also being searched.

>
> 4) search speed?
>
usually subsecond (or close) after some warmup. while this might seem slow, it's fast compared to the competition, trust me.

>
> I'd love to learn more about your architecture.
>
i hate to say you would be disappointed, but there's nothing fancy. probably why it works...

>
> -John
>
>
> On Wed, Dec 3, 2008 at 10:13 PM, Robert Muir <[EMAIL PROTECTED]> wrote:
>
>> sorry gotta speak up on this. i indexed 300m docs today. I'm using an
>> out-of-the-box jar.
>>
>> yeah i have some special subclasses but if i thought any of this stuff was
>> general enough to be useful to others i'd submit it. I'm just happy to have
>> something scalable that i can customize to my peculiarities.
>>
>> so i think i fit in your 10% and i'm not stressing on either scalability or
>> api.
>>
>> thanks,
>> robert
>>
>>
>> On Thu, Dec 4, 2008 at 12:36 AM, John Wang <[EMAIL PROTECTED]> wrote:
>>
>>> Grant:
>>> I am sorry that I disagree with some points:
>>>
>>> 1) "I think it's a sign that Lucene is pretty stable." - Lucene is a
>>> great project, and great improvements have been made, especially with the
>>> 2.x releases, but do we really have a clear picture of how Lucene is being
>>> used and deployed? While Lucene works great running as a vanilla search
>>> library, when pushed to its limits, one needs to "hack" into Lucene to make
>>> certain things work. If 90% of the user base uses it to build small indexes
>>> with the vanilla api, and the other 10% is really stressing both the
>>> scalability and api side and running into issues, would you still say:
>>> "running well for 90% of the users, therefore it is stable or extensible"?
>>> I think it is unfair to the project itself to be measured by the vanilla
>>> use-case.
>>> I have done a couple of large deployments, e.g. >30 million documents
>>> indexed and searched in realtime, and I really had to do some tweaking.
>>>
>>>
>>
>> --
>> Robert Muir
>> [EMAIL PROTECTED]
>>
>

--
Robert Muir
[EMAIL PROTECTED]