Thanks, Robert, for sharing.
Good to hear it's working for what you need it to do.

3) Especially with ReadOnlyIndexReaders, searches should not be blocked while
you are indexing, particularly on multicore machines (rough sketch below).
4) Do you stay at sub-second responses with high throughput?
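
To illustrate 3), here is a minimal sketch, assuming the 2.4-style
IndexReader.open(dir, true) API; the index path, field name, and analyzer
below are just placeholders, not anything from your setup:

import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class ReadOnlySearchWhileIndexing {
    public static void main(String[] args) throws Exception {
        // Placeholder index location.
        Directory dir = FSDirectory.getDirectory(new File("/path/to/index"));

        // The indexing side keeps adding documents; one doc and a commit
        // here for brevity.
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(),
                IndexWriter.MaxFieldLength.UNLIMITED);
        Document doc = new Document();
        doc.add(new Field("body", "hello lucene", Field.Store.NO,
                Field.Index.ANALYZED));
        writer.addDocument(doc);
        writer.commit();

        // Read-only reader: the 'true' flag gives a reader that cannot
        // modify the index, so it skips the per-call synchronization a
        // writable reader needs, and searches are not blocked by the
        // indexing thread.
        IndexReader reader = IndexReader.open(dir, true);
        IndexSearcher searcher = new IndexSearcher(reader);
        TopDocs hits = searcher.search(
                new TermQuery(new Term("body", "lucene")), 10);
        System.out.println("hits: " + hits.totalHits);

        searcher.close();
        reader.close();
        writer.close();
    }
}

A reader only sees documents committed before it was opened, so you would
still reopen and re-warm it periodically to pick up new commits.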

-John

On Wed, Dec 3, 2008 at 11:03 PM, Robert Muir <[EMAIL PROTECTED]> wrote:

>
>
> On Thu, Dec 4, 2008 at 1:24 AM, John Wang <[EMAIL PROTECTED]> wrote:
>
>> Nice!
>> Some questions:
>>
>> 1) one index?
>>
> No, but two individual ones today were around 100M docs.
>
>> 2) How big are your documents? E.g., how many terms, etc.
>>
> The last one built has over 4M terms.
>
>> 3) Are you serving (searching) the docs in real time?
>>
> I don't understand this question, but searching is slower if I am indexing
> on a disk that's also being searched.
>
>>
>> 4) search speed?
>>
> Usually sub-second (or close) after some warmup. While this might seem slow,
> it's fast compared to the competition, trust me.
>
>>
>> I'd love to learn more about your architecture.
>>
> I hate to say it, but you'd be disappointed: there's nothing fancy. That's
> probably why it works...
>
>>
>> -John
>>
>>
>> On Wed, Dec 3, 2008 at 10:13 PM, Robert Muir <[EMAIL PROTECTED]> wrote:
>>
>>> Sorry, I have to speak up on this. I indexed 300M docs today, and I'm using
>>> an out-of-the-box jar.
>>>
>>> Yeah, I have some special subclasses, but if I thought any of this stuff
>>> was general enough to be useful to others I'd submit it. I'm just happy to
>>> have something scalable that I can customize to my peculiarities.
>>>
>>> So I think I fit in your 10%, and I'm not stressing over either scalability
>>> or the API.
>>>
>>> thanks,
>>> robert
>>>
>>>
>>> On Thu, Dec 4, 2008 at 12:36 AM, John Wang <[EMAIL PROTECTED]> wrote:
>>>
>>>> Grant:
>>>>         I'm sorry, but I disagree with some points:
>>>>
>>>> 1) "I think it's a sign that Lucene is pretty stable." - While lucene is
>>>> a great project, especially with 2.x releases, great improvements are made,
>>>> but do we really have a clear picture on how lucene is being used and
>>>> deployed. While lucene works great running as a vanilla search library, 
>>>> when
>>>> pushed to limits, one needs to "hack" into lucene to make certain things
>>>> work. If 90% of the user base use it to build small indexes and using the
>>>> vanilla api, and the other 10% is really stressing both on the scalability
>>>> and api side and are running into issues, would you still say: "running 
>>>> well
>>>> for 90% of the users, therefore it is stable or extensible"? I think it is
>>>> unfair to the project itself to be measured by the vanilla use-case. I have
>>>> done couple of large deployments, e.g. >30 million documents indexed and
>>>> searched in realtime., and I really had to do some tweaking.
>>>>
>>>>
>>>
>>> --
>>> Robert Muir
>>> [EMAIL PROTECTED]
>>>
>>
>>
>
>
> --
> Robert Muir
> [EMAIL PROTECTED]
>
