Try cts:tokenize() instead of fn:tokenize(). Rather than splitting on a regex,
it breaks the text into cts:word, cts:punctuation, and cts:space tokens.
For example:
let $n := <p>a b c d e f g h i j k</p>
let $q := 'f'
let $words-before := 3
let $words-after := 3
return cts:highlight(
  $n, $q,
  element test {
    (: keep only the word tokens of the matched text node :)
    let $words := cts:tokenize($cts:node)[. instance of cts:word]
    (: position of the first word equal to the matched text :)
    let $x := index-of($words, $cts:text)[1]
    (: clamp so the window never starts before the first word :)
    let $start := max((1, $x - $words-before))
    let $stop := $x + $words-after
    for $w in $words[$start to $stop]
    return
      if ($w ne $cts:text) then $w
      else element span { attribute class { 'highlight' }, $w }
  }
)/test/node()
=>
c d e
<span class="highlight">f</span>
g h i
You might be able to improve on that, but hopefully it gives you the
general idea.
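If it helps to see what cts:tokenize() actually returns, here is a small
sketch that labels each token by its type. The sample string is made up for
illustration, and the labels are just strings I chose.

```xquery
xquery version "1.0-ml";

(: Label every token from cts:tokenize() by its cts:token subtype.
   The input string is a made-up sample. :)
for $t in cts:tokenize("Restore faith, now.")
return
  typeswitch ($t)
    case cts:word return fn:concat("word: ", $t)
    case cts:punctuation return fn:concat("punctuation: ", $t)
    case cts:space return "space"
    default return "other"
```

Filtering with [. instance of cts:word], as in the example above, simply keeps
the items that would take the first branch here.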
-- Mike
David Sewell wrote:
cts:highlight() will do the highlighting around your search phrase, but
in order to create a partial context around the first match in the
paragraph it's necessary to write a fairly complex ad hoc function,
probably relying on fn:tokenize() to establish the context and then
fn:string-join() to recombine the selected context into something that
can be passed to cts:highlight().
We wrote some code to do just that for a publication, and it works well,
but it's a bit messy and could no doubt be optimized and made more
general-purpose. I'd hesitate to post it to the whole list in its
current state--maybe someone out there has an elegant general-purpose
function to do this task--but I can share it offline if you like.
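For reference, a minimal sketch of the fn:tokenize() / fn:string-join()
approach described above. This is not the publication code mentioned above.
The function name local:context and the sample paragraph are hypothetical,
and the sketch assumes a single-word search term and splits on single spaces,
so punctuation stays attached to the neighbouring word.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: return up to $before words before and $after words
   after the first occurrence of $term in $text, joined into one string.
   Returns the empty sequence when the term does not occur. :)
declare function local:context(
  $text as xs:string,
  $term as xs:string,
  $before as xs:integer,
  $after as xs:integer
) as xs:string?
{
  (: Collapse whitespace, then split into space-separated words. :)
  let $words := fn:tokenize(fn:normalize-space($text), " ")
  (: Index of the first word equal to the term, ignoring case and any
     attached punctuation. :)
  let $hit := (
    for $w at $i in $words
    where fn:lower-case(fn:replace($w, "[^\p{L}\p{N}]", ""))
          eq fn:lower-case($term)
    return $i
  )[1]
  return
    if (fn:empty($hit)) then ()
    else
      (: Clamp the window at the first word; fn:subsequence() already
         stops at the last word. :)
      let $start := fn:max((1, $hit - $before))
      let $length := $hit + $after - $start + 1
      return fn:string-join(fn:subsequence($words, $start, $length), " ")
};

(: Made-up sample paragraph built around the phrase from the question. :)
let $para := <p>Today we have the opportunity and also the responsibility to help restore faith in the family. That is our task.</p>
let $term := "responsibility"
let $snippet := <span>... {local:context(fn:string($para), $term, 7, 7)} ...</span>
return cts:highlight($snippet, cts:word-query($term), <b>{$cts:text}</b>)
```

With this sample, local:context() keeps the 7 words on either side of
"responsibility", from "we" through "family."; the cts:highlight() call then
wraps the match inside the shortened snippet.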
David S.
On Wed, 20 Aug 2008, Mindie Sorenson wrote:
I’m trying to limit the number of words that are displayed for each search
result. For example, I need to have the results formatted like this:
... we have the opportunity and also the responsibility to help restore faith
in the family. ...
I have been able to get the first paragraph from each document that contains
the search phrase (responsibility), but I haven’t been able to successfully
limit the words to 7 words before the search phrase and 7 after the phrase. I
have looked through the cts:search functions, but I haven’t been able to find
quite what I’m looking for. Is there a function that will help me do this?
Thanks
Mindie
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general