We find users generally prefer a "jump-to" feature to a
filter. That is, rather than showing only the results starting with the
letter "X" (there might not be any), show all results sorted alphabetically,
beginning at the first result that sorts >= "X". This is just a mechanism
for positioning by value within a large result list, rather than by page
number. To accomplish this we use a range index on a processed sort key
(the range index can handle diacritic normalization). The
cts:element-range-query() function is at the heart of this; it's nice
because it can be combined with other query parameters.
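
To make the positioning idea concrete, here is a minimal sketch in plain Python. It is not MarkLogic code: the sorted list stands in for the range index, and sort_key(), jump_to(), and the sample titles are all hypothetical names made up for illustration. It assumes the processed sort key is the title decomposed (NFD), stripped of combining marks, and case-folded.

```python
import bisect
import unicodedata


def sort_key(title: str) -> str:
    """Hypothetical processed sort key: decompose, drop combining marks, casefold."""
    decomposed = unicodedata.normalize("NFD", title)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def jump_to(sorted_keys, titles, start, page_length=20):
    """Return one page of titles beginning at the first key that sorts >= start."""
    pos = bisect.bisect_left(sorted_keys, sort_key(start))
    return titles[pos:pos + page_length]


# Made-up sample data; the sorted list plays the role of the range index.
raw = ["Walden", "Zadig", "Émile", "Ethics", "Yvain", "Dubliners"]
titles = sorted(raw, key=sort_key)
keys = [sort_key(t) for t in titles]

# No title starts with "X", so the page simply begins at the next entry.
print(jump_to(keys, titles, "X", page_length=2))
# "Émile" is found under "E" because the accent is stripped from its key.
print(jump_to(keys, titles, "E", page_length=2))
```

Note that the result list never comes back empty just because a letter has no entries, which is the main reason users seem to like this better than a filter.
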
I'm not sure whether this approach would fit well with the
search:search API, though.
Cheers
-Mike
PS: some difficulties arise if you want to be able to page backwards
from the starting point (for instance, showing the last 20 entries at the
end of the W's in the example above), but it's all doable.
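
A rough sketch of the backward case, again in plain Python rather than MarkLogic code, with a sorted list standing in for the range index. The page_before() helper and the sample keys are hypothetical; the point is only that the page ends just before the starting position instead of beginning at it.

```python
import bisect


def page_before(sorted_keys, start, page_length=20):
    """Return up to page_length keys that sort immediately before start."""
    pos = bisect.bisect_left(sorted_keys, start)
    # Clamp at 0 so a short or empty page is returned near the top of the list.
    return sorted_keys[max(0, pos - page_length):pos]


# Made-up, already-normalized sort keys in ascending order.
keys = ["walden", "war and peace", "waverley", "yvain", "zadig"]

# The last two entries at the end of the W's, just before where "x" would sort.
print(page_before(keys, "x", page_length=2))
```

Against an index, I would expect this to mean asking for keys < the starting value in descending order and then reversing the page for display, but that mapping is an assumption on my part.
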
On 3/3/2011 9:55 AM, Murray, Gregory wrote:
> Hello,
>
> I'm developing a web application in which I want to provide a browse-by-title
> feature with an alpha wheel -- by which I mean a row of links labeled A, B,
> C, etc. that allow the user to click on a letter and get back all titles
> starting with that letter. Under the hood I need to implement this feature as
> a search. That is, I don't want to retrieve all titles and then filter out
> the titles starting with a particular letter. That won't scale, and it just
> seems so inelegant and overwrought. It also won't work with my pagination
> code, where I provide links allowing the user to page through the results in
> pages of $page-length, because that code relies on the <search:response>
> document that search:search() returns (including relying on @total).
>
> Also, ideally the solution to this problem should include Unicode
> normalization, specifically decomposition. Currently we're building a pilot
> project, but we expect to have tens of thousands of documents eventually,
> some of which might not be in English. I need the search results to include
> documents where the first letter of the title starts with a non-ASCII
> character, such as a letter with a diacritical mark. Simply put, when the
> user clicks the "E" link I need to retrieve titles starting with "E" but also
> ones starting with "É" etc.
>
> In the database config, I've got an element range index on the relevant
> element, which in our documents is <sortTitle>, containing the title with
> initial articles like a/an/the stripped off.
>
> My first thought was to modify the XML documents themselves to include an
> attribute containing the (Unicode normalized) first letter of <sortTitle>. I
> assume that would allow me to set up an attribute index and base my searches
> on that, as in search:search("first-letter:A", ...). But I consider that a
> last resort; I'd much prefer to handle this within the application rather
> than updating the documents.
>
> I thought that using the * wildcard might work, but I haven't been able to
> hit upon the right mix of index(es), word lexicon(s), and database config
> settings to make that idea work.
>
> Thanks in advance for any advice!
> Greg
>
> Gregory Murray
> Digital Library Application Developer
> Princeton Theological Seminary Library
> [email protected]
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general