Found public int getPositionIncrementGap(String fieldName) on Analyzer. Sweet! Should've read more before emailing.
tjk :)

On Fri, Jan 15, 2010 at 10:19 AM, TJ Kolev <tjko...@gmail.com> wrote:

> Hi!
>
> I don't think the easy solution will work for me, because I'll have more
> than two fields in a group - perhaps 6 - 10.
>
> However using span queries looks very promising. I'll investigate that.
>
> I see setPositionIncrement() only on the Token object. Is there a way to
> set this when adding a field to the document, so that the first token gets
> its position pushed away? I would prefer not to modify my analyzer if
> possible.
>
> Thank you.
> tjk :)
>
>
> On Wed, Jan 13, 2010 at 3:52 PM, Erick Erickson
> <erickerick...@gmail.com> wrote:
>
>> Ooooh, isn't that easier. You just prompted me to think
>> that you don't even have to do that, just index the pairs as single
>> tokens (KeywordAnalyzer? but watch out for no case folding)...
>>
>> On Wed, Jan 13, 2010 at 4:30 PM, Digy <digyd...@gmail.com> wrote:
>>
>> > How about using languages as fieldnames?
>> > Doc1(Ra):
>> >   Java:5
>> >   C:2
>> >   PHP:3
>> >
>> > Doc2(Rb):
>> >   Java:2
>> >   C:5
>> >   VB:1
>> >
>> > Query: Java:5 AND C:2
>> >
>> > DIGY
>> >
>> > -----Original Message-----
>> > From: TJ Kolev [mailto:tjko...@gmail.com]
>> > Sent: Wednesday, January 13, 2010 11:00 PM
>> > To: java-user@lucene.apache.org
>> > Subject: Problem: Indexing and searching repeating groups of fields
>> >
>> > Greetings,
>> >
>> > Let's assume I have to index and search "resume" documents. Two fields
>> > are defined: Language and Years. The fields are associated together in
>> > a group called Experience. A resume document may have 0 or more
>> > Experience groups:
>> >
>> > Ra{ E1(Java,5); E2(C,2); E3(PHP,3);}
>> > Rb{ E1(Java,2); E2(C,5); E3(VB,1);}
>> >
>> > How do I index such documents, and how do I search, so I can formulate a
>> > query like this "Resumes which have (Java,5) and (C,2)" and get back Ra.
>> > I know I can index multiple fields of the same name, and do
>> > "(Language:Java AND Years:5) AND (Language:C AND Years:2)", but in
>> > addition to Ra that would also return Rb, which I don't want. The
>> > problem here is that the "grouping" is lost. I can create fields with
>> > compound names (E1Language, E1Years, E2Language, E2Years, etc), but
>> > that helps me none, as I don't know which group to search. I'd also
>> > like to query for "(Language:Java AND Years:5) OR (Language:C AND
>> > Years:2)".
>> >
>> > This is a simplified example. Real documents may have 30 - 40 groups,
>> > each one with several fields. Putting all the fields in a group in one
>> > index field won't work, as the numeric/date ones should be available
>> > for range searches.
>> >
>> > So far the way I see it is to do my own post processing on the results.
>> > The issue is that text fields will need to be untokenized, or otherwise
>> > it would be difficult to work on the result and determine what matches.
>> >
>> > Thank you.
>> > tjk :)
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> > For additional commands, e-mail: java-user-h...@lucene.apache.org
>> >
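
For anyone following the thread, here is a rough plain-Java sketch of why a position increment gap helps with span/proximity matching across repeated groups. It does not use any Lucene classes: `GroupedPositions`, `index`, `near`, `GAP` and the "maximum distance" check are all made-up names and simplifications for illustration only. In Lucene itself the gap would come from `getPositionIncrementGap(String fieldName)` on the Analyzer and the proximity test from a span query, so treat this only as a model of the idea, assuming each group is added as one value of the same field.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model (no Lucene) of token positions with a gap between field values. */
final class GroupedPositions {

    /** Hypothetical gap size; it only has to exceed the distance allowed at query time. */
    static final int GAP = 100;

    /**
     * Assigns a position to every token. Tokens inside one group get consecutive
     * positions; "gap" extra positions are skipped before the next group starts.
     */
    static Map<String, List<Integer>> index(String[][] groups, int gap) {
        Map<String, List<Integer>> positions = new HashMap<>();
        int next = 0;
        for (String[] group : groups) {
            for (String term : group) {
                positions.computeIfAbsent(term, k -> new ArrayList<>()).add(next++);
            }
            next += gap; // push the first token of the following group away
        }
        return positions;
    }

    /** True if some occurrence of a and some occurrence of b are at most maxDistance apart. */
    static boolean near(Map<String, List<Integer>> positions, String a, String b, int maxDistance) {
        for (int pa : positions.getOrDefault(a, List.of())) {
            for (int pb : positions.getOrDefault(b, List.of())) {
                if (Math.abs(pa - pb) <= maxDistance) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Rb from the original question: (Java,2), (C,5), (VB,1)
        String[][] rb = {{"java", "2"}, {"c", "5"}, {"vb", "1"}};

        // No gap: the "2" of the Java group sits right next to "c", so (C,2) matches wrongly.
        System.out.println(near(index(rb, 0), "c", "2", 1));   // prints true

        // With a gap larger than the allowed distance, the match cannot cross groups.
        System.out.println(near(index(rb, GAP), "c", "2", 1)); // prints false
    }
}
```

The only design point is that the gap must be larger than whatever distance the query tolerates; the value 100 is arbitrary.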