Hi,

I have been studying the Lucene indexing code for a bit. I am not sure I fully understand the scope of the problem, but storing extra information through TermInfosWriter may not solve it?
For the XML tag-depth example, could the depth be a separate field? A Lucene term is a combination of (field, termText), so if depth is a field, two XML tags with the same name but different depths are still treated as separate terms.

This is what I could think of so far.

Jian

On 10/10/05, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>
> http://wiki.apache.org/jakarta-lucene/Lucene2Whiteboard
>
> See item #11 of API changes. Maybe along the lines of what you are
> interested in, although I don't know if anyone has even attempted a design
> of it. I would also like to see this, plus the ability to store info at
> higher levels in the Index, such as Field (not on a per token basis),
> Document (info about the document that spans its fields) and Index (such
> as coreference information). Alas, no time...
>
> -Grant
>
> >-----Original Message-----
> >From: Shane O'Sullivan [mailto:[EMAIL PROTECTED]
> >Sent: Monday, October 10, 2005 8:38 AM
> >To: java-dev@lucene.apache.org
> >Subject: Adding generic payloads to a Term's posting list
> >
> >Hi,
> >
> >To the best of my knowledge, it is not possible to add generic
> >data to a Term's posting list.
> >By this I mean info that is defined by the search engine, not
> >Lucene itself.
> >Whereas Lucene adds some data to the posting lists, such as
> >the term's position within a document, there are many other
> >useful types of information that could be attached to a term.
> >
> >Some examples would be in XML documents, to store the depth of
> >a tag in the document, or font information, such as if the
> >term appeared in a header or in the main body of text.
> >
> >Are there any plans to add such functionality to the API? If
> >not, where would be the appropriate place to implement these
> >changes? I presume the TermInfosWriter and TermInfosReader
> >would have to be altered, as well as the classes which call
> >them.
> >Could this be done without having to modify the index in
> >such a way that standard Lucene indexes couldn't read it?
> >
> >Thanks
> >
> >Shane
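
To make the separate-field suggestion at the top of this message a bit more concrete, here is a minimal sketch in plain Java. It does not use Lucene's API: the Term class, the depthField naming scheme ("depth_" plus the depth) and the DepthAsFieldSketch index are all hypothetical stand-ins, meant only to show that keying postings on (field, text) keeps the same tag at different depths apart.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model only: none of these classes are Lucene's. They mimic the idea that
// a term is identified by (field, text), so the same tag name at two depths
// ends up with two separate posting lists.
public class DepthAsFieldSketch {

    // Hypothetical stand-in for a term: equality covers both field and text.
    static final class Term {
        final String field;
        final String text;

        Term(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Term)) {
                return false;
            }
            Term other = (Term) o;
            return field.equals(other.field) && text.equals(other.text);
        }

        @Override
        public int hashCode() {
            return Objects.hash(field, text);
        }
    }

    // term -> sorted ids of the documents that contain it
    private final Map<Term, SortedSet<Integer>> postings = new HashMap<>();

    // Assumed naming scheme: one field per depth, e.g. "depth_2".
    static String depthField(int depth) {
        return "depth_" + depth;
    }

    // Record that document docId has tag tagName at the given depth.
    void addTag(int docId, String tagName, int depth) {
        Term term = new Term(depthField(depth), tagName);
        postings.computeIfAbsent(term, k -> new TreeSet<Integer>()).add(docId);
    }

    // Documents that have tagName at exactly this depth (empty set if none).
    SortedSet<Integer> docsWithTag(String tagName, int depth) {
        SortedSet<Integer> docs = postings.get(new Term(depthField(depth), tagName));
        return docs != null ? docs : new TreeSet<Integer>();
    }

    public static void main(String[] args) {
        DepthAsFieldSketch index = new DepthAsFieldSketch();
        index.addTag(1, "title", 1);
        index.addTag(2, "title", 3);
        index.addTag(3, "title", 3);

        System.out.println(index.docsWithTag("title", 1)); // [1]
        System.out.println(index.docsWithTag("title", 3)); // [2, 3]
        System.out.println(index.docsWithTag("title", 2)); // []
    }
}
```

The cost in this sketch is that a lookup has to name the depth it wants; finding a tag at any depth would mean one lookup per depth field.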