On Thu, Feb 11, 2010 at 08:30:14AM -0500, Michael McCandless wrote:
> Oh you're saying we don't know if the underlying enum actually skipped vs
> just scanned?
Yep.
> Isn't the skip data also based on deltas?
Yes, but that's internal to the skip reader, in both Lucene and Lucy/KS. When
it com
On Wed, Feb 10, 2010 at 2:42 PM, Marvin Humphrey wrote:
> On Wed, Feb 10, 2010 at 12:33:27PM -0500, Michael McCandless wrote:
>
>> In Lucene, skipping is done through the aggregator,
>
> I had a look at MultiDocsEnum in the flex branch. It doesn't know when
> a sub-enum is reading skip data.
I'm c
On Wed, Feb 10, 2010 at 12:33:27PM -0500, Michael McCandless wrote:
> In Lucene, skipping is done through the aggregator,
I had a look at MultiDocsEnum in the flex branch. It doesn't know when
a sub-enum is reading skip data.
> > I suppose another possibility would have been to have the aggregato
On Wed, Feb 10, 2010 at 8:27 AM, Marvin Humphrey wrote:
>> But why didn't you have the Multi*Enums layer add the offset (so
>> that the codec need not know who's consuming it)? Performance?
>
> That would have involved something like this within the aggregator:
>
>posting.setDocID(posting.ge
On Wed, Feb 10, 2010 at 9:47 AM, Renaud Delbru wrote:
> On 10/02/10 13:15, Uwe Schindler wrote:
>>>
>>> Could you provide pointers to search code that uses the segment-level
>>> enum?
>>> As I explained in my last answer to Michael, the TermScorer is using
>>> the DocsEnum interface, and ther
On 10/02/10 13:15, Uwe Schindler wrote:
Could you provide pointers to search code that uses the segment-level
enum?
As I explained in my last answer to Michael, the TermScorer is using the
DocsEnum interface, and therefore does not know if it manipulates a
segment-level enum or a Multi*Enums. What s
On Wed, Feb 10, 2010 at 06:58:01AM -0500, Michael McCandless wrote:
> But why didn't you have the Multi*Enums layer add the offset (so that
> the codec need not know who's consuming it)? Performance?
That would have involved something like this within the aggregator:
posting.setDocID(pos
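The snippet above is cut off in the archive and is left as-is. As a separate illustration of the idea under discussion (an aggregator adding a per-segment offset so the codec need not know who is consuming it), here is a minimal sketch. It is not Lucene or Lucy/KS code: SegmentDocs, ArraySegmentDocs, MultiDocs and collect are hypothetical names, and the doc bases are assumed to be known up front.

```java
import java.util.ArrayList;
import java.util.List;

public class DocBaseDemo {
    /** Hypothetical segment-level enum: yields segment-local doc IDs in order. */
    interface SegmentDocs {
        int NO_MORE_DOCS = Integer.MAX_VALUE;
        int nextDoc();
    }

    /** Toy segment enum backed by a sorted array of local doc IDs. */
    static final class ArraySegmentDocs implements SegmentDocs {
        private final int[] docs;
        private int pos = 0;

        ArraySegmentDocs(int... docs) {
            this.docs = docs;
        }

        @Override
        public int nextDoc() {
            return pos < docs.length ? docs[pos++] : NO_MORE_DOCS;
        }
    }

    /** Hypothetical aggregator: walks the segments in order and adds each doc base. */
    static final class MultiDocs {
        private final SegmentDocs[] subs;
        private final int[] docBases;
        private int current = 0;

        MultiDocs(SegmentDocs[] subs, int[] docBases) {
            this.subs = subs;
            this.docBases = docBases;
        }

        int nextDoc() {
            while (current < subs.length) {
                int local = subs[current].nextDoc();
                if (local != SegmentDocs.NO_MORE_DOCS) {
                    // The offset is applied here, once per posting, in the aggregator.
                    return docBases[current] + local;
                }
                current++; // this segment is exhausted, move on to the next one
            }
            return SegmentDocs.NO_MORE_DOCS;
        }
    }

    /** Drains the aggregator into a list of index-wide doc IDs. */
    static List<Integer> collect(MultiDocs multi) {
        List<Integer> out = new ArrayList<>();
        for (int doc = multi.nextDoc(); doc != SegmentDocs.NO_MORE_DOCS; doc = multi.nextDoc()) {
            out.add(doc);
        }
        return out;
    }

    public static void main(String[] args) {
        SegmentDocs[] subs = { new ArraySegmentDocs(0, 3), new ArraySegmentDocs(1, 2) };
        int[] docBases = { 0, 10 }; // assumed: the second segment starts at doc 10
        System.out.println(collect(new MultiDocs(subs, docBases))); // prints [0, 3, 11, 12]
    }
}
```

In this sketch the extra addition happens on every posting that passes through the aggregator, which is the kind of per-posting cost the "Performance?" question refers to.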
> Could you provide pointers to search code that uses the segment-level
> enum?
> As I explained in my last answer to Michael, the TermScorer is using the
> DocsEnum interface, and therefore does not know if it manipulates a
> segment-level enum or a Multi*Enums. What search (or query operators)
> i
On 10/02/10 09:47, Uwe Schindler wrote:
Positions as attributes would be good. For positions we need a new Attribute
(not PositionIncrement), but e.g. for offsets and payloads we can use the
standard attributes from the analysis, which is really cool. This would also
make it possible to add al
Hi Michael,
On 09/02/10 20:47, Michael McCandless wrote:
But, then, it's very convenient when you need it and don't care about
performance. EG in Renaud's usage, a test case that is trying to
assert that all indexed docs look right, why should you be forced to
operate per segment? He shouldn't
On Tue, Feb 9, 2010 at 4:44 PM, Marvin Humphrey wrote:
>> Interesting... and segment merging just does its own private
>> concatenation/mapping-around-deletes of the doc/positions?
>
> I think the answer is yes, but I'm not sure I understand the
> question completely since I'm not sure why you'd
> > And we don't return "objects or aggregates" with Multi*Enum now...
>
> Yeah, this is different. In KS right now, we use a generic
> PostingList, which conveys different information depending on what
> class of Posting it contains.
>
> > In flex right now the codec is unaware that it's being
On Tue, Feb 09, 2010 at 03:47:19PM -0500, Michael McCandless wrote:
> Interesting... and segment merging just does its own private
> concatenation/mapping-around-deletes of the doc/positions?
I think the answer is yes, but I'm not sure I understand the question
completely since I'm not sure why y
On Tue, Feb 9, 2010 at 1:12 PM, Marvin Humphrey wrote:
> On Tue, Feb 09, 2010 at 11:51:31AM -0500, Michael McCandless wrote:
>
>> You should (when possible/reasonable) instead use
>> ReaderUtil.gatherSubReaders, then iterate through those sub readers
>> asking each for its flex fields.
>
>> But if
On Tue, Feb 09, 2010 at 11:51:31AM -0500, Michael McCandless wrote:
> You should (when possible/reasonable) instead use
> ReaderUtil.gatherSubReaders, then iterate through those sub readers
> asking each for its flex fields.
>
> But if this is only for testing purposes, and Multi*Enum is more
> c
On Tue, Feb 9, 2010 at 11:35 AM, Renaud Delbru wrote:
>> This particular patch doesn't change the Codecs API -- it "only"
>> factors out the Multi* APIs from MultiReader. Likely you won't need
>> to change your codec... but try applying the patch and see :)
>>
>
> Ok, good news ;o).
Flex is sti
On 09/02/10 16:04, Michael McCandless wrote:
On Tue, Feb 9, 2010 at 9:08 AM, Renaud Delbru wrote:
So, does it mean that the codec interface is likely to change? Do I need to
be prepared to change all my code again ;o) ?
This particular patch doesn't change the Codecs API -- it "only
On Tue, Feb 9, 2010 at 9:08 AM, Renaud Delbru wrote:
> Hi Michael,
>
> On 09/02/10 13:35, Michael McCandless wrote:
>>
>> It's great that you're testing the flex APIs... things are still "in
>> flux" as you've seen. There's another big patch pending on
>> LUCENE-2111...
>>
>
> So, does it mean th
Hi Michael,
On 09/02/10 13:35, Michael McCandless wrote:
It's great that you're testing the flex APIs... things are still "in
flux" as you've seen. There's another big patch pending on
LUCENE-2111...
So, does it mean that the codec interface is likely to change? Do I
need to be prepared t
Renaud,
It's great that you're testing the flex APIs... things are still "in
flux" as you've seen. There's another big patch pending on
LUCENE-2111...
Out of curiosity... in what circumstances do you see a Multi*Enum appearing?
Lucene's core always searches "by segment". Are you doing somethin
Hi Renaud,
> On 09/02/10 12:16, Uwe Schindler wrote:
> > In flex the correct way to add additional posting data to these
> > classes would be the usage of custom attributes, registered in the
> > attributes() AttributeSource.
> >
> Ok, I have changed my code to use the AttributeSource interface.
>
Hi Uwe,
On 09/02/10 12:16, Uwe Schindler wrote:
In flex the correct way to add additional posting data to these classes would
be the usage of custom attributes, registered in the attributes()
AttributeSource.
Ok, I have changed my code to use the AttributeSource interface.
Due to some l
February 09, 2010 1:05 PM
> To: java-user
> Cc: Michael McCandless
> Subject: Flex & Docs/AndPositionsEnum
>
> Hi Michael,
>
> I have updated my lucene-1458, and I discovered there were big
> modifications in the StandardCodec interface.
> I updated my own codecs to t
Hi Michael,
I have updated my lucene-1458, and I discovered there were big
modifications in the StandardCodec interface.
I updated my own codecs to this new interface, but I encountered a
problem. My codecs are creating DocsAndPositionsEnum subclasses that
allow access to more information than si