Mike,

"(AUTHOR:john AUTHOR:smith)" is doing what I expect, but I need a different query in addition to this one. Let me give you some context. Suppose that we have two documents:

<DOCUMENT>

The problem with the query above is that it will match both documents, but I want a query that matches only the first document. In other words, I want the AND to be performed within any single AUTHOR field. The obvious syntax for this is: "AUTHOR:(john smith)". However, lib-parse appears to parse this as if the query were as follows: 'AUTHOR:"" (john smith)' - in other words, it's looking for an AUTHOR field with "" (and there is none), so the entire query returns zero results. What I expected it to do is something like:

(cts:element-word-query(QName("AUTHOR"), cts:and-query(cts:word-query("john"), cts:word-query("smith"))))

When I run the first query against my database, I get 7210 results. When I run the second query against my database, I get 2691 results, which is exactly what I'd expect, because the second query returns fewer results than the first query. Here's the actual query:

for $i in xdmp:estimate(cts:search(input(),

This seems like such an obvious thing to do that I can't believe that I'm the first one to do it, so I was hoping that someone else already had implemented something similar.

I'm using MarkLogic 4.0-2.2 and lib-search-3.2-2008-05-13.1 (it looks like you're using a newer version - where do I get it?). My lib-parser-custom.xqy has some additional code to deal with range queries (appended below). At first glance, it shouldn't have an impact.
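For comparison, the "AND within a single AUTHOR element" behaviour described above can be written directly against the cts API, without going through lib-parser. This is only a minimal sketch of the query shape, not something lib-parser generates: the AUTHOR element name and the two terms are taken from the example above, and searching over doc() is an assumption made for illustration.

```xquery
(: Sketch only: AUTHOR and the terms "john"/"smith" come from the example above;
   searching over doc() is an assumption made for illustration. :)
cts:search(
  doc(),
  (: cts:element-query constrains its inner query to a single AUTHOR element :)
  cts:element-query(
    xs:QName("AUTHOR"),
    cts:and-query((
      cts:word-query("john"),
      cts:word-query("smith")
    ))
  )
)
```

The difference from an and-query of two element-word-query terms is that there each term may be satisfied by a different AUTHOR element in the same document, whereas here both words have to occur inside the same AUTHOR element.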
-Dave

23a24,25
> import module namespace map = "http://sirma.marklogic.com/lib/map" at "/lib/map.xqy"
>
55a58
> (:
82a86
> :)
114a119,210
> define function custom:hasRange($text)
> {
>   if( contains($text,"~>") or
>       contains($text,"~<") or
>       contains($text,"~=") or
>       contains($text,"~>=") or
>       contains($text,"~<=") ) then true()
>   else false()
> }
>
> (: [EMAIL PROTECTED] :)
> define function custom:rangeQuery($qname,$text,$type)
> {
>   let $tokens := tokenize($text,"~")
>   let $optr := $tokens[2]
>   let $val := custom:castToDataType($tokens[1],$type)
>   return cts:element-range-query(xs:QName($qname),$optr,$val, element cts:option { "collation=http://marklogic.com/collation//MO" })
> }
>
> define function custom:element-value-query(
>   $qnames as xs:QName*,
>   $text as xs:string*,
>   $options as xs:string*,
>   $weight as xs:double
> ) {
>   let $queries :=
>     for $i in $qnames
>     let $mapping := $map:map//*:[EMAIL PROTECTED] = $i]
>     let $parent := string($mapping/@parent)
>     let $range := string($mapping/@isRange)
>     let $type := string($mapping/@type)
>     return
>       if($range and custom:hasRange($text)) then custom:rangeQuery($i,$text,$type)
>       else if($parent) then
>         cts:element-query(xs:QName($parent), cts:word-query($text, $options, $weight))
>       else
>         cts:element-value-query(xs:QName($i), $text, $options, $weight)
>   return
>     if(count($queries) gt 1) then cts:or-query($queries) else $queries
> }
>
> define function custom:element-word-query(
>   $qnames as xs:QName*,
>   $text as xs:string*,
>   $options as xs:string*,
>   $weight as xs:double
> ) {
>   let $queries :=
>     for $i in $qnames
>     let $mapping := $map:map//*:[EMAIL PROTECTED] = $i]
>     let $parent := string($mapping/@parent)
>     let $range := string($mapping/@isRange)
>     let $type := string($mapping/@type)
>     return
>       if($range and custom:hasRange($text)) then custom:rangeQuery($i,$text,$type)
>       else
>         cts:element-query(xs:QName($i), cts:word-query($text, $options, $weight))
>       (:if($parent) then
>       else
>         cts:element-word-query(xs:QName($i), $text, $options, $weight):)
>   return
>     if(count($queries) gt 1) then cts:or-query($queries) else $queries
> }
>
> define function custom:element-query($qnames, $queries, $options){
>   let $queries :=
>     for $i in $qnames
>     return
>       cts:element-query(xs:QName($i), $queries, $options)
>   return
>     if(count($queries) gt 1) then cts:or-query($queries) else $queries
> }
>
> define function custom:castToDataType($value,$type)
> {
>   try
>   {
>     let $value := replace($value,'"','')
>     return
>       if($type="dateTime") then xs:dateTime(xs:date($value)) else
>       if($type="unsignedLong") then xs:unsignedLong($value) else
>       if($type="int") then xs:int($value) else
>       if($type="unsignedInt") then xs:unsignedInt($value) else
>       if($type="") then $value else
>       lp:error( concat("Unknown type",":"),$value)
>   }
>   catch($ex)
>   {
>     lp:error(concat("Invalid dataType",":"),$value)
>   }
> }
>

Message: 6
Date: Thu, 11 Dec 2008 08:44:57 -0800
From: Michael Blakeley <[EMAIL PROTECTED]>
Subject: Re: [MarkLogic Dev General] lib-parse - how to search for boolean expressions within a field
To: General Mark Logic Developer Discussion <[email protected]>
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset=UTF-8; format=flowed

lib-parser tries to emulate google's syntax. It does not implement "AUTHOR:(john AND smith)" because that isn't google syntax. You are, of course, free to write your own parser, and you can even use the apache-licensed lib-parser.xqy code as a starting point.

But first let's look into "(AUTHOR:john AUTHOR:smith)" some more. I suspect that you might be tickling a bug or misunderstanding an option (or perhaps you had an extra space before "john"?).

Here's my test, using MarkLogic Server 4.0-2.2 and lib-parser version 3.2-2008-10-08.1 with the built-in code mapping:

import module namespace lp="http://www.marklogic.com/ps/lib/lib-parser"
  at "lib-parser.xqy";
lp:get-cts-query('title:foo title:bar')
=>
cts:and-query((cts:element-word-query(QName("", "title"), "foo", ("lang=en"), 1),
  cts:element-word-query(QName("", "title"), "bar", ("lang=en"), 1)), ())

That's what I'd expect: the output is an and-query of element-word-query terms (*not* element-query). Note that I omitted AND, because it's a no-op (as with google's syntax).

From the collection-query in your sample output, it's clear that your test case must be more complex than mine. Can you provide a full test case? What result did you expect, and what are you getting? Which version of the server are you using? What version of lib-parser.xqy do you have? Have you made any changes to lib-parser-custom.xqy?

-- Mike

On 2008-12-11 00:07, Dave Feldmeier wrote:
