>> For example, in the phrase "A man saw an elephant", "saw" has the following
>> annotations (let us also say that its position in the index is 1234):
>>
>> {lemma: see, pos: verb, tense: past}, {lemma: saw, pos: noun, number:
>> singular}
>>
>> I think, it would be more effective to insert parse index in
was. Maybe the linking can be done via Payloads
(offsets in the original text)? If I want to store multiple things at the
same startOffset then I just use something like SynonymFilter?
stephen
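
A minimal sketch of the "multiple things at the same position" idea raised above. In Lucene, SynonymFilter stacks tokens by emitting them with a position increment of 0. The code below only models that arithmetic in plain Java and uses no Lucene classes: the Token class, the positions helper, and the "lemma=.../pos=..." term encoding are all hypothetical, chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StackedTokens {

    /** Hypothetical stand-in for an analyzed token: a term plus its position increment. */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Turns position increments into absolute positions, formatted as "position:term". */
    static List<String> positions(List<Token> tokens) {
        List<String> out = new ArrayList<>();
        int position = -1; // so that a first increment of 1 lands on position 0
        for (Token t : tokens) {
            position += t.positionIncrement;
            out.add(position + ":" + t.term);
        }
        return out;
    }

    /** Example stream for "a man saw an elephant"; the annotation encoding is made up. */
    static List<Token> example() {
        return Arrays.asList(
            new Token("a", 1),
            new Token("man", 1),
            new Token("saw", 1),
            new Token("lemma=see/pos=verb/tense=past", 0),     // stacked on "saw"
            new Token("lemma=saw/pos=noun/number=singular", 0), // stacked on "saw"
            new Token("an", 1),
            new Token("elephant", 1));
    }

    public static void main(String[] args) {
        for (String line : positions(example())) {
            System.out.println(line);
        }
    }
}
```

Both annotation tokens land on the same position as "saw", which is what lets a positional query treat them as alternatives for that word. Whether this fits a given annotation scheme depends on how the annotations need to be queried.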
On 12/21/12 6:45 AM, "Michael McCandless" wrote:
> On Thu, Dec 20, 2012 at 3:54 PM, Wu, Ste
> If you stuff the end of the span into the payload you'd have to create
> a custom variant of PhraseQuery to properly match based on the end
> span.
How different is this from the functionality already available through
SpanQuery?
stephen
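
For the quoted suggestion of stuffing the end of the span into the payload, here is a rough sketch of what the encoding side might look like. A Lucene payload is an opaque byte sequence, so the layout is up to the application. The 4-byte big-endian layout, the inclusive end position, and the names encodeEnd, decodeEnd and covers are assumptions made for this example, not anything from Lucene or from the thread.

```java
import java.nio.ByteBuffer;

public class SpanEndPayload {

    /** Encodes a span's end position as a 4-byte big-endian payload (assumed layout). */
    static byte[] encodeEnd(int endPosition) {
        return ByteBuffer.allocate(4).putInt(endPosition).array();
    }

    /** Reads the end position back out of a payload written by encodeEnd. */
    static int decodeEnd(byte[] payload) {
        return ByteBuffer.wrap(payload).getInt();
    }

    /**
     * True if the span starting at startPosition, whose end (inclusive, by assumption)
     * is stored in the payload, covers the given position.
     */
    static boolean covers(int startPosition, byte[] payload, int position) {
        return startPosition <= position && position <= decodeEnd(payload);
    }

    public static void main(String[] args) {
        byte[] payload = encodeEnd(7); // annotation spanning positions 3..7
        System.out.println(decodeEnd(payload));
        System.out.println(covers(3, payload, 5));
        System.out.println(covers(3, payload, 8));
    }
}
```

The decoding step is the part a custom query would have to perform at match time, since the stock positional queries only see the start position of each token.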
--
t I want to do, and the things that can now be done in
> GATE very easily, would be possible using Mike's suggested method.
>
>
> -Glen
>
> On Thu, Dec 13, 2012 at 6:27 AM, Michael McCandless
> wrote:
>> On Wed, Dec 12, 2012 at 3:02 PM, Wu, Stephen T., Ph.D.
>>
>> Is there any (preliminary) code checked in somewhere that I can look at,
>> that would help me understand the practical issues that would need to be
>> addressed?
>
> Maybe we can make this more concrete: what new attribute are you
> needing to record in the postings and access at search time?
I’ve been trying to do semi-structured queries & query parsing. In other
words, you could have XML snippets mixed in with plain terms, e.g. a query like:
christmas tree
where you’re looking for a document with the terms “christmas” “tree” but also
some structured data about where (pract
ome
> APIs do expose this, it's not very well explored yet (eg, you'd have
> to make a custom indexing chain to get the attributes "through"
> IndexWriter down to your codec). It would be great to make progress
> making this easier, so ideas are very welcome :)
>
t if you want to go with Payloads that do more than boosting a
> term, chances are that you'll need to rewrite a big part of the query
> stack.
>
>
> On 27/11/2012 16:59, Wu, Stephen T., Ph.D. wrote:
>> I think we're looking at doing something related. I
I think we're looking at doing something related. I haven't explored the
Enums and don't know how to make a postings codec... But what is "flexible
indexing" in Lucene 4.0 if it's not the ability to make new postings codecs?
We're trying to incorporate attributes onto terms/spans in indexes. We'd
also