one minor correction... (!)

> > 2. What the query string suppose to be if I want to get records which
> > contain (Australia and 20020415) and (not (HongKong and 20020315))?
>
> ((Australia +tagname:country) AND (+tagname:date +20020415)) AND
> !(( tagname:country HongKong) AND (tagname:date 20020415))

-B

----- Original Message -----
From: "Brandon Jockman" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Monday, May 13, 2002 10:31 AM
Subject: Re: Search on XML files

> Fanny,
>
> The current implementation allows for searching on:
>
>   a. the entire PCDATA content of an XML document.
>   b. the PCDATA content within specific elements.
>   c. processing instructions by name and content.
>   d. attributes of elements by both name and value.
>   e. elements/PIs with specific parent element types.
>   f. elements/PIs at specific child locations within a parent element.
>   g. elements/PIs with specific ancestor element types.
>   h. elements/PIs with specifically ordered ancestor element type.
>
> The original need we had for XML contextual searching was to find a specific
> document that contained a particular element with particular content, and in
> relationships to other element types.
>
> Currently, searching for a document based on content of two separate
> elements with a logical AND relationship is not provided. However, the OR
> relationship should work just fine.
>
> There is a field stored that contains all text content for the document, but
> that probably isn't enough for what you need.
>
> Each lucene document from the same XML document has a 'docid' field.
>
> You have two real options:
>
> 1. Write a queryparser that inherits from the Lucene one that detects the
> relationship and performs more than one search, grouping results based on
> document id.
>
> Searching for X and Y would become:
>   1. Search for X -> Hits_X
>   2. Search for Y -> Hits_Y
>   3. Merge Hits_X and Hits_Y based on docid.
>
> -=-
>
> 2. Write a queryparser that inherits from the lucene one, detects that you
> are searching for a document based on several elements, as opposed to a
> single one, and converts the search from:
>
>   X AND Y
>
> to:
>
>   (X AND docid:docidentifier) OR (Y AND docid:docidentifier)
>
> ...and then merge results based on docid.
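
To make the "merge based on docid" step in option 1 concrete, here is a minimal sketch. It assumes the 'docid' values have already been read out of each hit list into plain strings, and it uses no Lucene API at all; the class name DocIdMerge, its methods, and the sample docids are hypothetical, chosen only for illustration.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: combines per-element hit lists by their 'docid' values.
final class DocIdMerge {

    // X AND Y: docids that appear in both hit lists.
    static Set<String> intersect(Collection<String> hitsX, Collection<String> hitsY) {
        Set<String> merged = new LinkedHashSet<>(hitsX);
        merged.retainAll(hitsY);
        return merged;
    }

    // X OR Y: docids that appear in either hit list.
    static Set<String> union(Collection<String> hitsX, Collection<String> hitsY) {
        Set<String> merged = new LinkedHashSet<>(hitsX);
        merged.addAll(hitsY);
        return merged;
    }

    // X AND NOT Y: docids in the first list that are absent from the second.
    static Set<String> subtract(Collection<String> hitsX, Collection<String> hitsY) {
        Set<String> merged = new LinkedHashSet<>(hitsX);
        merged.removeAll(hitsY);
        return merged;
    }

    public static void main(String[] args) {
        // Made-up docids, standing in for the result of four separate searches.
        List<String> australia = List.of("doc1", "doc2", "doc3");
        List<String> date0415 = List.of("doc2", "doc3", "doc4");
        List<String> hongKong = List.of("doc3", "doc5");
        List<String> date0315 = List.of("doc3");

        // (Australia and 20020415) and not (HongKong and 20020315)
        Set<String> wanted = intersect(australia, date0415);
        Set<String> excluded = intersect(hongKong, date0315);
        System.out.println(subtract(wanted, excluded)); // prints [doc2]
    }
}
```

The same three helpers cover the OR case from question 1 (union of the two intersections). Keeping the merge outside the query string is what lets it work even though each element is indexed as its own Lucene document.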
>
> You may also be able to leverage the search 'Filtering' mechanism, but I'm
> not experienced with that...
>
> <<<From FAQ>>>
> 16. What is filtering and how is it performed?
> Filtering means imposing additional restriction on the hit list to eliminate
> hits that otherwise would be included in the search results. There are two
> ways to filter hits:
>
>   a. Search Query - in this approach, provide your custom filter object
>      when you call the search() method. This filter will be called exactly
>      once to evaluate every document that resulted in non zero score.
>   b. Selective Collection - in this approach you perform the regular search
>      and when you get back the hit list, collect only those that match your
>      filtering criteria. In this approach, your filter is called only for
>      hits that are returned by the search method, which may be only a subset
>      of the non zero matches (useful when evaluating your search filter is
>      expensive).
> <<< ... >>>
>
> > 1. What the query string suppose to be if I want to get records which
> > contain (Australia and 20020415) or (HongKong and 20020315)?
>
> ((Australia +tagname:country) AND (+tagname:date +20020415)) OR ((HongKong
> +tagname:country) AND (tagname:date +20020415))
>
> > 2. What the query string suppose to be if I want to get records which
> > contain (Australia and 20020415) and (not (HongKong and 20020315))?
>
> ((Australia +tagname:country) AND (+tagname:date +20020415)) AND
> (( tagname:country HongKong) AND (tagname:date 20020415))
>
> Either of these queries will require the additional functionality outlined
> in options 1 or 2 above.
>
> Regards,
>
> -Brandon
>
> Brandon Jockman
> ISOGEN International, LLC.
> [EMAIL PROTECTED]
>
> ----- Original Message -----
> From: "Fanny Yeung" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Monday, May 13, 2002 7:48 AM
> Subject: Search on XML files
>
> > Hi,
> >
> > Does anyone know how to make up the query for multiple fields search on
> > XML files in the sample provided by isogen? Does it support?
> >
> > I would like to get all the results which contain the value of 'Australia'
> > in tag 'country' AND the date is '20020415' in the tag 'date'. I always
> > get 0 hit count. Any problem of my query string?
> >
> > +(Australia AND tagname:country) AND +(20020415 AND tagname:date)
> >
> > 1. What the query string suppose to be if I want to get records which
> > contain (Australia and 20020415) or (HongKong and 20020315)?
> > 2. What the query string suppose to be if I want to get records which
> > contain (Australia and 20020415) and (not (HongKong and 20020315))?
> >
> > Since I am a newbie on Lucene, I am wonder whether I can use filter to
> > restricts the search results? In my case, I need to retrieve all the news
> > between a date range (for example, 20020102 to 20020330). In addition, the
> > result should only contains those news that have been subscribed. Should
> > I use filter to filter out the unsubscribed news? Or I should make up a
> > query string to include those subscribed news? Which approach is better in
> > terms of performance?
> >
> > Thanks in advance.
> >
> > Fanny
> >
> > --
> > To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
> > For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
