Re: Multisearcher will maintain index order sorting?

Hadi Forghani Thu, 23 Oct 2008 02:56:07 -0700

Because when you want to fetch the X document of the second index, you
should pass docID=3 to MultiSearcher, and MultiSearcher works out which
sub-searcher holds that document from the document counts of all the
sub-searchers.
For example, when you get the doc with docID 3 (the second X), MultiSearcher
(see the code of MultiSearcher's doc(int i)) subtracts 3 from your docID
(because the first searcher has 3 documents), then passes 0 to the second
searcher and asks it to return its doc 0.
Note that MultiSearcher actually finds the sub-searcher with a binary
search, not literally the way I described it.
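
The mapping described above can be sketched roughly as follows. This is only
an illustration written for this thread, not Lucene's actual source: the
class name DocIdMapping and the helpers computeStarts, subSearcher and
localDoc are hypothetical, and the two 3-document indexes are taken from
Ganesh's example.

```java
final class DocIdMapping {
    // starts[i] is the first global docID owned by sub-searcher i;
    // the final entry holds the total document count.
    static int[] computeStarts(int[] docCounts) {
        int[] starts = new int[docCounts.length + 1];
        for (int i = 0; i < docCounts.length; i++) {
            starts[i + 1] = starts[i] + docCounts[i];
        }
        return starts;
    }

    // Binary search for the last sub-searcher whose start is <= globalDoc.
    static int subSearcher(int[] starts, int globalDoc) {
        int lo = 0;
        int hi = starts.length - 2; // index of the last sub-searcher
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (starts[mid] <= globalDoc) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    // Global docID minus the owning sub-searcher's start gives the local docID.
    static int localDoc(int[] starts, int globalDoc) {
        return globalDoc - starts[subSearcher(starts, globalDoc)];
    }

    public static void main(String[] args) {
        // IndexA and IndexB each hold 3 documents.
        int[] starts = computeStarts(new int[] {3, 3});
        for (int doc = 0; doc < starts[starts.length - 1]; doc++) {
            System.out.println("global " + doc
                + " -> searcher " + subSearcher(starts, doc)
                + ", local doc " + localDoc(starts, doc));
        }
    }
}
```

With these two indexes, global docID 3 maps to the second searcher's local
doc 0, which is the "second X" in the example.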


On Thu, Oct 23, 2008 at 12:47 PM, Ganesh <[EMAIL PROTECTED]> wrote:

> In IndexA there are 3 docs
> DocID, Terms
> 0,X
> 1,X Y
> 2,X Z
>
> In IndexB there are 3 docs
> DocID, Terms
> 0,X
> 1,X Y
> 2,X Z
>
> When I sort in index order using MultiSearcher and
> ParallelMultiSearcher, it returns the results
> 0,X
> 3,X
> 1,X Y
> 4,X Y
> 2,X Z
> 5,X Z
>
> But it should be in the order of 0,1,2,3,4,5. Could anyone explain why?
>
> Regards
> Ganesh
>
> ----- Original Message ----- From: "Ganesh" <[EMAIL PROTECTED]>
> To: <[email protected]>
> Sent: Thursday, October 23, 2008 1:37 PM
>
> Subject: Re: Multisearcher will maintain index order sorting?
>
>
>> When MultiSearcher and ParallelMultiSearcher are asked to sort on doc
>> (index order), they merge the results by the docID within each DB.
>>
>> Regards
>> Ganesh
>>
>> ----- Original Message ----- From: "Paul Smith" <[EMAIL PROTECTED]>
>> To: <[email protected]>
>> Sent: Thursday, October 23, 2008 10:57 AM
>> Subject: Re: Multisearcher will maintain index order sorting?
>>
>>
>>
>>> On 23/10/2008, at 4:20 PM, Ganesh wrote:
>>>
>>>> My index DB has 10 million records and will grow to 30 million.
>>>> Currently I am using millisecond timestamps and the RAM consumption is
>>>> high. I will change the resolution to minute. I am using 2 searcher
>>>> objects, refreshing each other every minute. When I do a warmup query
>>>> sorted on timestamp, the CPU spikes to 100%, and this happens every
>>>> minute. To avoid these issues, I am planning to break up my DB and
>>>> sort in index order.
>>>>
>>>> Will MultiSearcher maintain index order when sorting?
>>>>
>>>
>>>
>>> If you need to keep the millisecond accuracy, break down the timestamp
>>> into 3 fields: day, time, millisecond, and sort on the 3 fields. This way
>>> each field has a much smaller number of distinct values and will occupy
>>> vastly less memory over time. I don't think there's much overhead in this
>>> approach either, because in most cases the top-level field (day) will
>>> provide most of the sorting ability, and Lucene will only need to hit the
>>> time and millisecond fields less frequently for comparison.
>>>
>>> I believe MultiSearcher does a merge sort of the 2 (or more)
>>> sub-searchers, so there is an overhead in using it versus a single searcher.
>>>
>>> Paul
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>>> For additional commands, e-mail: [EMAIL PROTECTED]
>>>
>>>
