I think you should file the report. When you do, it will be important to know
whether <rs> is configured as a phrase-through or phrase-around element (I
suspect it is).

________________________________________
From: [email protected] 
[[email protected]] On Behalf Of David Sewell 
[[email protected]]
Sent: Tuesday, May 08, 2012 7:35 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Anyone noticing buggy snippeting behavior 
with search:search()?

5.0-3 (RHEL)

On Tue, 8 May 2012, Colleen Whitney wrote:

> David, what version are you running?
> ________________________________________
> From: [email protected] 
> [[email protected]] On Behalf Of David Sewell 
> [[email protected]]
> Sent: Tuesday, May 08, 2012 7:24 AM
> To: General Mark Logic Developer Discussion
> Subject: [MarkLogic Dev General] Anyone noticing buggy snippeting behavior
> with search:search()?
>
> All,
>
> I'm probably going to submit a formal bug report on this class of problem, but
> I'm just wondering whether other users out there have noticed similar
> phenomena.
>
> The default snippeting behavior in search:search() tends to do the wrong thing
> in cases where the matched text is in an element (say a <p> or <para>) that
> has mixed content. For example, consider this paragraph from our data:
>
> <p>Charles Yancey (1766–ca. 1825) was a magistrate of <rs>Albemarle County</rs>
> from 1796, colonel in the local militia, 1806–15, and sheriff, 1821–23. He
> represented the county in the <name>Virginia House of Delegates</name>, 1814–17.
> Yancey also operated a tavern, store, mill, and distillery. He corresponded
> regularly with TJ on subjects ranging from procurement of clover seed and
> millstones to matters under consideration by the <name>General Assembly</name>,
> including the incorporation of <name>Central College</name> [... etc.]</p>
>
> Running search:search() with a simple query on
>
>        "Central College"
>
> as a phrase produces the snippet result (omitting @path):
>
> <search:match>Charles Yancey (1766–ca. 1825) was a magistrate of 
> <search:highlight>Central College</search:highlight> </search:match>
>
> Note that "was a magistrate of Central College" misrepresents the text. There
> should be an ellipsis after "magistrate of".
>
> Removing the <rs> tag from "Albemarle County" in the source eliminates the
> buggy output, so there's definitely an interaction with embedded elements
> going on. I'm just wondering if others have noticed similar behavior with
> their content.
>
> David S.
>
> --
> David Sewell, Editorial and Technical Manager
> ROTUNDA, The University of Virginia Press
> PO Box 400314, Charlottesville, VA 22904-4314 USA
> Email: [email protected]   Tel: +1 434 924 9973
> Web: http://rotunda.upress.virginia.edu/
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
>
