[MarkLogic Dev General] Anyone noticing buggy snippeting behavior with search:search()?

David Sewell Tue, 08 May 2012 07:24:52 -0700

All,

I'm probably going to submit a formal bug report on this class of problem, but I'm just wondering whether other users out there have noticed similar phenomena. The default snippeting behavior in search:search() tends to do the wrong thing in cases where the matched text is in an element (say a <p> or <para>) that has mixed content. For example, consider this paragraph from our data:

<p>Charles Yancey (1766–ca. 1825) was a magistrate of <rs>Albemarle County</rs> from 1796, colonel in the local militia, 1806–15, and sheriff, 1821–23. He represented the county in the <name>Virginia House of Delegates</name>, 1814–17. Yancey also operated a tavern, store, mill, and distillery. He corresponded regularly with TJ on subjects ranging from procurement of clover seed and millstones to matters under consideration by the <name>General Assembly</name>, including the incorporation of <name>Central College</name> [... etc.]</p>


Running search:search() with a simple query on

        "Central College"

as a phrase produces the snippet result (omitting @path):

<search:match>Charles Yancey (1766–ca. 1825) was a magistrate of <search:highlight>Central College</search:highlight> </search:match>

Note that "was a magistrate of Central College" misrepresents the text. There should be an ellipsis after "magistrate of".
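
For anyone who wants to try this against their own content, here is a minimal sketch of the kind of call involved. It assumes the sample paragraph is loaded as a document in the current database and that no custom options node is passed, so the default snippeting applies; the exact options in our application are not shown here.

```xquery
xquery version "1.0-ml";

(: Standard Search API library module shipped with MarkLogic :)
import module namespace search = "http://marklogic.com/appservices/search"
    at "/MarkLogic/appservices/search/search.xqy";

(: Double quotes inside the query text make it a phrase search.
   No options node is supplied (an assumption for this sketch),
   so the default snippet transform is used. :)
search:search('"Central College"')
```

The snippets come back as search:match elements inside each search:result, which is where to look for the missing ellipsis.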

Removing the <rs> tag from "Albemarle County" in the source eliminates the buggy output, so there's definitely an interaction with embedded elements going on. I'm just wondering if others have noticed similar behavior with their content.


David S.

--
David Sewell, Editorial and Technical Manager
ROTUNDA, The University of Virginia Press
PO Box 400314, Charlottesville, VA 22904-4314 USA
Email: [email protected]   Tel: +1 434 924 9973
Web: http://rotunda.upress.virginia.edu/