That gives me some ideas, thanks. It does presuppose, though, that each of those terms will match *any* result; otherwise I'll have links to possibly 0 results.
But I could see how I could use cts:walk to do this ... maybe optimize by using the cts:exists function ...

From: Danny Sokolsky
Sent: Wednesday, November 18, 2009 6:09 PM
To: General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General] FreeText searching - Take Two

I think you will have to do a second search for this, but that is not necessarily bad. How about if you make the value of your ???? be the URI of a document that is the result of the second search? Something like:

    cts:highlight($paragraph, $query,
      <a href="{xdmp:node-uri(cts:search(collection(), $query)[1])}">{$cts:text}</a>)

Note that you will have to choose one (I chose the first in relevance order) if you want a single URI to link to, or else make several links (for the first 5, for example). Or you can simply have the link run the new search and give back a list of relevance-ranked results. To have it run the new search, the link might be to an xqy file, passing it the appropriate parameters so it can run that search.

-Danny

From: Lee, David
Sent: Wednesday, November 18, 2009 2:39 PM
To: General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General] FreeText searching - Take Two

Thanks, but I think I'm missing something really subtle (or really obvious!). I don't want to highlight the results of the search ... I want to highlight the results of a *previous* search given a new search. I think this is hard to explain. Let me try again.

Suppose a *previous search* got me a document with an element with this string, which I am displaying:

    "The anterior glandular lobe of the pituitary gland, also known as the adenohypophysis. It secretes the ADENOHYPOPHYSEAL HORMONES that regulate vital functions such as GROWTH; METABOLISM; and REPRODUCTION."

Now I want to find, within this string, matches such as "GROWTH" across the entire database, then link "GROWTH" to the result of that search, but link "METABOLISM" to the results that match "METABOLISM".

So suppose I do a cts:search on "GROWTH", "ADENOHYPOPHYSEAL HORMONES", "METABOLISM" and get a bunch of results pointing to many different documents. The result I want would ultimately be:

    The anterior glandular lobe of the pituitary gland, also known as the adenohypophysis. It secretes the <A href="get.xquery?doc3">ADENOHYPOPHYSEAL HORMONES</A> that regulate vital functions such as <A href="get.xquery?doc1">GROWTH</A>; <A href="get.xquery?doc1">METABOLISM</A>; and REPRODUCTION.

I don't want to highlight the results of the search; I want to highlight the original text, and fill in, say, the document-uri of the corresponding hit for that keyword.

So if I did as suggested:

    let $paragraph := "... the above paragraph"
    let $query := (list of upper case words in $paragraph)
    for $res in cts:search(collection(), $query)
    return cts:highlight($paragraph, $query, <A href="?????">{$cts:text}</A>)

What's the missing ????

Where do I find which result corresponded to which word in the original search? I can't find any API that, given a search result, points to the part of the query it matched.

The only way I can figure to do this is to do a new cts:search() for each of the terms I want to match, one by one.

From: Danny Sokolsky
Sent: Wednesday, November 18, 2009 4:46 PM
To: General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General] FreeText searching - Take Two

And when you do the cts:highlight, you typically do it on a subset of the results (the first page, for example) so that you do not need to get every result from your search, which, as you said, would be expensive.

So the pseudo-code is something like:

    let $query := "hello"
    for $res in cts:search(collection(), $query)[1 to 10]
    return cts:highlight($res, $query, <b>{$cts:text}</b>)

The Search API (search:search, for example) does a lot of this for you, so you might want to look into that for a more complete solution.

-Danny

From: Mike Sokolov
Sent: Wednesday, November 18, 2009 1:30 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] FreeText searching - Take Two

If I understood you right, what you want is:

    let $query := cts:word-query (("ADENOHYPOPHYSEAL HORMONES", "GROWTH", ...))
    for $result in cts:search ( , $query)
    return cts:highlight ($result, $query, <match>{$cts:text}</match>)

and then arrange for <match> nodes to be highlighted in your output? You'd get a list of (whatever you searched for - documents?) with the highlighting applied...

I'm confused because you seem to have already written almost exactly that, so maybe I'm missing something? Maybe what you're looking for is more consolidated output like:

    return <result uri="{base-uri($result)}">{cts:highlight ( ... )//match}</result>

?

-Mike

Lee, David wrote:

This is a refinement on the question I asked the other day. I'm getting better at formulating my questions, so maybe the advice might be closer :)

Suppose I do a query and get an XML document which has a field with text that looks like this:

    "The anterior glandular lobe of the pituitary gland, also known as the adenohypophysis. It secretes the ADENOHYPOPHYSEAL HORMONES that regulate vital functions such as GROWTH; METABOLISM; and REPRODUCTION."

Those things in all upper case are likely terms that exist in other documents. What I'd like to do is run a search for each of those terms across the entire DB and, if found, create links to either the highest-scored find or to a results page (either will do).

I've poked around and found many little things that are part of a possible solution, but nothing that does exactly what I want. For example, cts:highlight() could be used to add the links to the words, and I could find a consolidated result set by using

    cts:search( ... , cts:word-query ( fn:tokenize(phrase) ) )

to find all matches, etc. I can easily get all the upper case words and create a search.

But my problem is this. Suppose I create a search on all the upper case words ("ADENOHYPOPHYSEAL", "GROWTH", "METABOLISM") and get a result back with cts:search(). How can I match up which nodes of the result matched which word, so I can highlight them? E.g. if I did

    for $result in cts:search( ..... all the words )
      (: Which word did $result match ??? :)

The only thing I can think of is that I would have to loop through the terms and do a search one by one:

    for $word in ( big oh list of search words )
      for $result in cts:search( ... , $word )
        cts:highlight( $phrase , $word , { link the word } )

My guess is that this will perform horribly. I'd rather get a single consolidated search and then do some magic like

    for $result in cts:search( ... , all the words )
      cts:highlight( $phrase , the word that matched $result )

Does this make any sense? Is there an API or design pattern to do this? Or should I do the outer loop instead?

I looked at cts:walk, but it looked like using it for this would still involve looping on the cts:search() for each term matched.

Thanks for any advice.

----------------------------------------
David A. Lee
Senior Principal Software Engineer
Epocrates, Inc.
[email protected]
812-482-5224
________________________________
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
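For reference, the approach the thread converges on (highlight the original paragraph, run a second search per matched term, and skip terms with zero hits) might be sketched roughly as follows. This is only an illustration under stated assumptions, not code posted by anyone in the thread: the paragraph is wrapped in a <p> element because cts:highlight takes a node rather than a string, the term list is written out by hand, and the `uri` request parameter on get.xquery is hypothetical.

```xquery
xquery version "1.0-ml";

(: Assumption: the displayed text is available as a node; cts:highlight
   needs a node, not a plain string. :)
let $paragraph :=
  <p>It secretes the ADENOHYPOPHYSEAL HORMONES that regulate vital
     functions such as GROWTH; METABOLISM; and REPRODUCTION.</p>

(: Illustrative term list; in practice it would be extracted from the text. :)
let $terms := ("ADENOHYPOPHYSEAL HORMONES", "GROWTH", "METABOLISM", "REPRODUCTION")
let $query := cts:word-query($terms)

return
  cts:highlight($paragraph, $query,
    (: One follow-up search per matched span, keeping only the top hit
       in relevance order. :)
    let $top := cts:search(fn:collection(), cts:word-query($cts:text))[1]
    return
      if (fn:exists($top))
      (: get.xquery is the page named in the thread; "uri" is a made-up
         parameter name for this sketch. :)
      then <a href="get.xquery?uri={xdmp:url-encode(xdmp:node-uri($top))}">{$cts:text}</a>
      (: No document matches this term: leave the text unlinked. :)
      else $cts:text
  )
```

This still issues one search per highlighted term, as discussed above, but only for terms that actually occur in the paragraph, and the zero-result case falls back to plain text instead of producing a dead link.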
